Python Forum
Git clone all of a Github user's public repositories (download all repositories) - Printable Version




RE: Git clone all of a Github user's public repositories (download all repositories) - rootVIII - Jul-04-2019

Axel, your gists are awesome.

So I was originally scraping for lines that contained the phrase "repository_nwo" to identify repo names. For some reason that line doesn't show up in the HTML of everyone's repository page. I thought it might have to do with being logged in vs. logged out... but no. So now it just looks for each repository link in the HTML. Returning the results as a set may not be necessary, but I left it in just in case. My apologies again.

Also, I might add a check in the future so that it doesn't attempt to download forked repositories.
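
A minimal sketch of that link-based approach, assuming the repositories page contains anchors of the form `<a href="/<user>/<repo>" ...>`. The sample HTML, the repo names, the function name `extract_repo_names`, and the pattern are illustrative assumptions, not GitHub's actual markup or the script's exact regex. Collecting into a set drops names that are linked more than once.

```python
import re


def extract_repo_names(html, user):
    """Return the set of repo names linked as href="/<user>/<repo>"."""
    # re.escape guards against regex metacharacters in the username.
    # [^"/]+ must be followed by the closing quote, so deeper paths
    # such as /<user>/<repo>/stargazers are not matched.
    pattern = r'<a\s+[^>]*href="/%s/([^"/]+)"' % re.escape(user)
    return set(re.findall(pattern, html))


# Hypothetical markup, for illustration only.
sample_html = '''
<a href="/rootVIII/repo_one" itemprop="name codeRepository">repo_one</a>
<a href="/rootVIII/repo_one">repo_one</a>
<a href="/rootVIII/repo_two">repo_two</a>
<a href="/rootVIII/repo_two/stargazers">12</a>
<a href="/someone_else/not_mine">not_mine</a>
'''

repo_names = extract_repo_names(sample_html, "rootVIII")
print(sorted(repo_names))
```

Scraping like this stays fragile: any change to the page markup can silently break the pattern.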


RE: Git clone all of a Github user's public repositories (download all repositories) - Axel_Erfurt - Jul-04-2019

Thanks, it works now.


RE: Git clone all of a Github user's public repositories (download all repositories) - Skaperen - Jul-05-2019

(Jul-04-2019, 06:15 PM)rootVIII Wrote: Yup it's the regex... should have a fix in a little bit... sorry for that
ugh... those "unanticipated requirements", again.


RE: Git clone all of a Github user's public repositories (download all repositories) - rootVIII - Jul-05-2019

Hahaha I know. Sometimes I get too excited and post the code too early. Anyway, this one will only attempt to download the user's own repos and not forks. I changed the URL to display only source repos. It might be fun to get the forked ones too, though.

# rootVIII
# Download/clone all of a user's public source repositories
# Pass the Github user's username with the -u option
# Usage: python git_clones.py -u <github username>
# Example: python git_clones.py -u rootVIII
#
from argparse import ArgumentParser
from sys import exit, version_info
from re import findall
from subprocess import call
try:
    from urllib.request import urlopen
except ImportError:
    from urllib2 import Request, urlopen


class GitClones:
    def __init__(self, user):
        self.url = "https://github.com/%s" % user
        self.url += "?&tab=repositories&q=&type=source"
        self.git_clone = "git clone https://github.com/%s/" % user
        self.git_clone += "%s.git"
        self.user = user

    def http_get(self):
        if version_info[0] != 2:
            req = urlopen(self.url)
            return req.read().decode('utf-8')
        req = Request(self.url)
        request = urlopen(req)
        return request.read()

    def get_repo_data(self):
        try:
            response = self.http_get()
        except Exception:
            print("Unable to make request to %s's Github page" % self.user)
            exit(1)
        else:
            pattern = r"<a\s?href\W+%s/(.*)\"\s+" % self.user
            for line in findall(pattern, response):
                yield line.split('\"')[0]

    def get_repositories(self):
        return list(self.get_repo_data())

    def download(self, git_repos):
        for git in git_repos:
            cmd = self.git_clone % git
            try:
                call(cmd.split())
            except Exception as e:
                print(e)
                print('unable to download:%s\n ' % git)


if __name__ == "__main__":
    message = 'Usage: python git_clones.py -u <github username>'
    h = 'Github Username'
    parser = ArgumentParser(description=message)
    parser.add_argument('-u', '--user', required=True, help=h)
    d = parser.parse_args()
    clones = GitClones(d.user)
    repositories = clones.get_repositories()
    clones.download(repositories)



RE: Git clone all of a Github user's public repositories (download all repositories) - buran - Jul-10-2019

Just to mention that, given there is an API, it's better to make use of it than to scrape the HTML. It would be better on so many counts...


RE: Git clone all of a Github user's public repositories (download all repositories) - Skaperen - Jul-10-2019

Nice to know there is an API. They don't do much to let people know it's there.


RE: Git clone all of a Github user's public repositories (download all repositories) - rootVIII - Jul-11-2019

Yup I had no idea

I can also guarantee that it was more fun writing the code than making API calls :)


RE: Git clone all of a Github user's public repositories (download all repositories) - buran - Jul-13-2019

(Jul-11-2019, 05:27 AM)rootVIII Wrote: I can also guarantee that it was more fun writing the code than making API calls :)
You can write code that uses API calls. Or even write a Python wrapper (although there are many available).
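
A rough sketch of the API route, for comparison (Python 3 only). It assumes GitHub's REST endpoint `https://api.github.com/users/<user>/repos`, which returns a JSON list whose entries include `name`, `fork`, and `clone_url`. The function names, the sample data, and the `per_page=100` page size are my own choices; pagination beyond the first page and rate limiting are left out.

```python
import json
from urllib.request import urlopen

API_URL = "https://api.github.com/users/%s/repos?per_page=100"


def fetch_repos(user):
    """Fetch the first page of a user's public repos as a list of dicts."""
    with urlopen(API_URL % user) as response:
        return json.loads(response.read().decode("utf-8"))


def source_clone_urls(repos):
    """Return the clone URLs of repos that are not forks."""
    return [repo["clone_url"] for repo in repos if not repo.get("fork")]


# Hypothetical, trimmed-down response used here instead of a live request.
sample = [
    {"name": "repo_one", "fork": False,
     "clone_url": "https://github.com/someuser/repo_one.git"},
    {"name": "forked_repo", "fork": True,
     "clone_url": "https://github.com/someuser/forked_repo.git"},
]

for url in source_clone_urls(sample):
    print("git clone", url)
```

For a live run, `source_clone_urls(fetch_repos("<github username>"))` would replace the sample list. The `fork` flag makes skipping forks a one-line filter, with no dependence on the page markup.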


RE: Git clone all of a Github user's public repositories (download all repositories) - Skaperen - Jul-17-2019

I am now wondering why not:
        self.git_clone = "git clone https://github.com/%s/%%s" % user
instead of lines 21-22 in post #14.


RE: Git clone all of a Github user's public repositories (download all repositories) - rootVIII - Jul-20-2019

I wanted to leave that last %s unfilled so the loop could fill in the repository name. But to be honest, I didn't know the %% escape existed. Pretty neat.
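
For anyone else who hadn't seen it: in `%`-style formatting, `%%` produces a literal `%`, so a template can be filled in two passes. A small sketch (the repository name is made up, and I kept the `.git` suffix from the original script):

```python
user = "rootVIII"

# First pass: fill in the user; %%s survives as a literal %s.
git_clone = "git clone https://github.com/%s/%%s.git" % user
print(git_clone)

# Second pass: fill in the repository name, as the download loop does.
cmd = git_clone % "some_repo"
print(cmd)
```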