Andrew Guyton's Blog

This is a sequel to my article introducing Feed43. I wanted an RSS feed that shows sales on Steam, as some of them are amazing and I can’t be bothered to check the Steam store every weekend. If you want to skip the explanation, here’s the feed: Steam Sales.

Extraction

I first tried to extract the information from the Steam store main page, but it exceeded the 100 KB page size limit of free feeds. The page dedicated to Steam specials hit the same limit, so I created a YQL query to pull out only the relevant part of the page:

select * from html where url="http://store.steampowered.com/search/?sort_by=Price&sort_order=ASC&specials=1" and xpath="//div[@id='search_results']"
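To illustrate what the xpath part of that query selects, here is a minimal Python sketch that applies the same expression to a tiny hand-written page. The sample markup, the example.com link, and the function name extract_search_results are assumptions made up for illustration. The standard-library parser used here only accepts well-formed XML, so a real Steam page would most likely need a proper HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the Steam search page.
SAMPLE_PAGE = """
<html>
  <body>
    <div id="header">Navigation and other markup we do not need</div>
    <div id="search_results">
      <a href="http://example.com/game"><h4>Example Game</h4></a>
    </div>
  </body>
</html>
"""


def extract_search_results(page: str) -> str:
    """Return only the div with id='search_results', serialized as text."""
    root = ET.fromstring(page.strip())
    # ElementTree supports a limited XPath subset; this mirrors
    # //div[@id='search_results'] from the YQL query.
    node = root.find(".//div[@id='search_results']")
    if node is None:
        raise ValueError("no div with id='search_results' found")
    return ET.tostring(node, encoding="unicode")


if __name__ == "__main__":
    print(extract_search_results(SAMPLE_PAGE))
```

Everything outside the selected div is dropped, which is what brings the page under the size limit.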

Now that we’re within the size limit, we can extract the relevant parts of the page. My goal was to reconstruct the view from the sale page, so we need the game image, the title, the price, the platform, and the description. The trick is that I had to combine {*} fields, because I was running up against the free limit of 20 instances of {%} or {*}; the following pattern has 19 of them. Here’s the extraction I ended up using:

<a class="{*}" href="{%}"/>{*}<strike>${%}</strike>{*}<p>${%}</p>{*}<p>{%}</p>{*}<img {*} src="{%}" {*}<h4>{%}</h4>{*}<img class="platform_img" {*} src="{%}" {*}/>{%}-{*}
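For readers who think in regular expressions, the pattern above can be approximated in Python by treating {%} as a lazy capture group and {*} as a lazy skip. This is only a sketch under that assumption: Feed43's real matcher may handle whitespace and edge cases differently, and the helper names pattern_to_regex and count_placeholders are made up for illustration. It also makes it easy to check the placeholder count against the free limit.

```python
import re

# The extraction pattern from this post, copied verbatim.
EXTRACTION = (
    '<a class="{*}" href="{%}"/>{*}<strike>${%}</strike>{*}<p>${%}</p>{*}'
    '<p>{%}</p>{*}<img {*} src="{%}" {*}<h4>{%}</h4>{*}'
    '<img class="platform_img" {*} src="{%}" {*}/>{%}-{*}'
)

PLACEHOLDER = re.compile(r"\{[%*]\}")


def count_placeholders(pattern: str) -> int:
    """Count every {%} and {*} in a Feed43-style pattern."""
    return len(PLACEHOLDER.findall(pattern))


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Approximate a Feed43-style pattern as a regular expression.

    Assumption: {%} captures as little text as possible and {*} skips
    as little text as possible; literal text must match exactly.
    """
    parts = []
    pos = 0
    for match in PLACEHOLDER.finditer(pattern):
        parts.append(re.escape(pattern[pos:match.start()]))
        parts.append("(.*?)" if match.group() == "{%}" else ".*?")
        pos = match.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts), re.DOTALL)


if __name__ == "__main__":
    print(count_placeholders(EXTRACTION))       # placeholders used
    print(pattern_to_regex(EXTRACTION).groups)  # how many are captures
```

Under this reading, the captures are numbered in order of appearance, which is how the item template refers to them as {%1}, {%2}, and so on.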

Output

And here’s the item template. Yeah, tables suck, but it was the easy solution. If you can create something better in an RSS feed, feel free to do so:

<table border="1" cellspacing="0" cellpadding="0" width="100%"><tr><td rowspan="2" width="90"><img src="{%5}"/></td><td>&nbsp;<b>{%6}</b></td><td rowspan="2" width="70"><center><strike>${%2}</strike><br/><b>${%3}</b></center></td></tr><tr><td>&nbsp;<img src="{%7}"/> {%8}</td></tr></table>
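The numbered placeholders in the template refer to the captured values in order, so {%2} is the old price and {%3} is the sale price. A rough Python sketch of that substitution step follows. The function name render_item and the sample values are hypothetical, and this is an assumption about how the substitution behaves rather than Feed43's actual code.

```python
import re
from typing import Sequence


def render_item(template: str, captures: Sequence[str]) -> str:
    """Replace each {%n} in the template with the n-th captured value."""

    def substitute(match: re.Match) -> str:
        index = int(match.group(1))  # placeholders are 1-based
        return captures[index - 1]

    return re.sub(r"\{%(\d+)\}", substitute, template)


if __name__ == "__main__":
    # Made-up values standing in for two captures.
    print(render_item("<b>{%2}</b> was ${%1}", ["9.99", "Example Game"]))
```

This sketch does no HTML escaping, so it is only suitable for values that are already safe to embed in markup.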

Potential improvements

If there were some way to show the discount percentage (computed from the new price and the old price), that would make a nice addition, since that figure isn’t on the page I scraped for this. I also think a nicer display should be possible, something as snazzy as what is on the actual page.
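The percentage itself is simple arithmetic once both prices are available as numbers. Here is a small Python sketch, assuming the prices have already been parsed from the captured text. Feed43 templates can't run code like this, so it would have to live in some separate post-processing step, and the function name discount_percent is hypothetical.

```python
def discount_percent(old_price: float, new_price: float) -> int:
    """Return the discount as a whole-number percentage of the old price."""
    if old_price <= 0:
        raise ValueError("old_price must be positive")
    return round((1 - new_price / old_price) * 100)


if __name__ == "__main__":
    print(discount_percent(20.00, 5.00))  # 75
```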

Copyright © Andrew Guyton. All rights reserved.