I am a regular visitor to GitHub’s Explore/Trending pages, checking them multiple times every single day. They are good places to learn about new projects, but most of the trending repos are the same as the day before. So I spend quite some time just scrolling through repos that I have already checked out.

I didn’t want to waste more time and wanted the simplest access possible, so I wrote a script, gh-trend.py, using feeds from GitHub Trends RSS[1]. It checks the languages you are interested in for the time period you ask for, then lists anything that is new to standard output. The repos are saved in a JSON file, so the same repo will not be shown again.

A quick example command:


$ gh-trend.py -j /path/to/saved.json -p this-week all bash c cpp

You can list as many languages as you like, and either keep separate JSON files for different settings or run everything against the same one. I didn’t code it to use a configuration file, because I don’t want one.
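As a rough illustration of how a command line like the one above could be parsed, here is a minimal argparse sketch. It is not the actual code of gh-trend.py: the long option names (`--json`, `--period`), the `dest` names, and the choice to make `-j` required are assumptions; only `-j`, `-p`, the three period values, and the positional language list come from the examples in this post. The `today` default is inferred from the daily cron entry, which passes no `-p`.

```python
import argparse

# Period values as they appear in the examples in this post.
PERIODS = ("today", "this-week", "this-month")


def build_parser():
    """Build a parser for: gh-trend.py -j FILE [-p PERIOD] LANG [LANG ...]"""
    parser = argparse.ArgumentParser(prog="gh-trend.py")
    # Long option names and dest names are hypothetical.
    parser.add_argument("-j", "--json", dest="json_path", required=True,
                        metavar="FILE",
                        help="JSON file that remembers repos already shown")
    parser.add_argument("-p", "--period", choices=PERIODS, default="today",
                        help="trending period to check")
    parser.add_argument("languages", nargs="+", metavar="LANG",
                        help="languages to check, e.g. all bash c cpp")
    return parser
```

Calling `build_parser().parse_args()` inside the script would then give you `json_path`, `period`, and `languages` without needing any configuration file.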

Currently, I run it with cron and check the generated files the same way I open the reports of cron tasks. The script is run daily (today), weekly (this-week), and monthly (this-month).

The following crontab example shows a possible usage:


@daily /path/to/gh-trend.py -j /path/to/saved.json all python bash >> /path/to/output.txt
@weekly /path/to/gh-trend.py -j /path/to/saved.json -p this-week all python bash >> /path/to/output.txt
@monthly /path/to/gh-trend.py -j /path/to/saved.json -p this-month all python bash >> /path/to/output.txt

I don’t call the script directly, but wrap it in a Bash script in order to generate files with a timestamp included in the filenames, and also to convert the plain text into partial HTML using AWK:


gh-trend.py [options] | awk "
/^https/ {
  print(\"<a href='\" \$0 \"'>\"\
        gensub(\"https://github.com/\", \"@\", \"\", \$0)\
        \"</a><br/>\");
  next;
}
/^.+$/ {
  print(\"<span>\" \$0 \"</span><br/>\");
  next;
}
/^$/ {
  print(\"<br/>\");
}
" > output.html
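For readers who don’t use gawk (gensub is a gawk extension), the same three rules can be sketched in Python. This is only an illustrative equivalent, not code from gh-trend.py; the names `PREFIX` and `to_html` are made up for this example, and it assumes the script’s output consists of repo URLs, description lines, and blank lines, as the AWK patterns suggest.

```python
# Prefix that gets shortened to "@" in the link text.
PREFIX = "https://github.com/"


def to_html(line):
    """Convert one line of the plain-text output into an HTML fragment."""
    line = line.rstrip("\n")
    if line.startswith("https"):
        # URL line: link to the repo, showing "@user/repo" as the text.
        label = line.replace(PREFIX, "@", 1)
        return "<a href='%s'>%s</a><br/>" % (line, label)
    if line:
        # Any other non-empty line, such as a description.
        return "<span>%s</span><br/>" % line
    # Empty line: keep it as a visual separator.
    return "<br/>"
```

Feeding it from a pipe is a matter of looping over `sys.stdin` and printing `to_html(line)` for each line. Like the AWK version, this does not escape HTML special characters in descriptions; `html.escape` from the standard library could be applied to the text if that matters.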

The JSON file stores the username/reponame plus the description. I am not sure whether the JSON file will balloon in the future, but I can always modify the code to drop the description, which isn’t actually used once it has been saved.

The code is pretty simple: it just checks whether a repo is new and, if so, prints it out. That’s all it does. Nothing fancy, but it won’t let you miss a repo that you might be interested in.
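The check can be sketched in a few lines of Python. This is a minimal illustration rather than the actual code of gh-trend.py: the function names `load_seen`, `filter_new`, and `save_seen` are hypothetical, and the JSON layout (one object mapping username/reponame to description) is an assumption.

```python
import json
import os


def load_seen(path):
    """Return the saved repos as a dict, or an empty dict on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def filter_new(seen, repos):
    """Return the (name, description) pairs not in `seen`, and record them."""
    fresh = []
    for name, description in repos:
        if name not in seen:
            seen[name] = description
            fresh.append((name, description))
    return fresh


def save_seen(path, seen):
    """Write the dict of seen repos back to the JSON file."""
    with open(path, "w") as f:
        json.dump(seen, f, indent=2, sort_keys=True)
```

A run would load the file, pass the repos parsed from the feed through `filter_new`, print whatever comes back, and save the file again. Because only the key is checked, dropping the description later would not change the behavior.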

By the way, if you are also a Bitbucket user and want it to bring back the Explore page, which has been gone for more than a year, vote this issue up.


Updated 2013-10-31T05:01:17Z: I revised the HTML conversion, adding new shell code in gh-trend.py. It can now render:

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6SulB5X6ZiL20krHAAw3yLWfLzL1GFozIELzCpi2_Mg-0t0gU5M9Qf3MibXlNeu7DtIzeuwTailpAu5q0BvnzjSfsPxazACTHhaAj1ndiFVXkkHdxTq6uSetCByUTORLfZcMsj4uJuSA/s800/gh-trend%2520html%25202013-10-31--12%253A26%253A41.png

[1] Sadly, there is no Trending API, so I have to parse third-party feeds.