Git clone all of a Github user's public repositories (download all repositories)
Python Forum (https://python-forum.io), Forum: Code sharing (https://python-forum.io/forum-5.html), Thread: /thread-19336.html
RE: Git clone all of a Github user's public repositories (download all repositories) - rootVIII - Jul-04-2019
Axel, your gists are awesome. I was originally scraping for a line containing the phrase "repository_nwo" to identify the lines with repo names. For some reason that line doesn't show up in the HTML of everyone's repository page. I thought it might have to do with being logged in vs. logged out... but no. So now it just looks for each link in the HTML. Returning as a set may not be necessary, but I left it in just in case. My apologies again. Also, I might add a check in the future so that it doesn't attempt to download forked repositories.

RE: Git clone all of a Github user's public repositories (download all repositories) - Axel_Erfurt - Jul-04-2019
Thanks, it works now.

RE: Git clone all of a Github user's public repositories (download all repositories) - Skaperen - Jul-05-2019
(Jul-04-2019, 06:15 PM)rootVIII Wrote: Yup it's the regex... should have a fix in a little bit... sorry for that
ugh... those "unanticipated requirements", again.

RE: Git clone all of a Github user's public repositories (download all repositories) - rootVIII - Jul-05-2019
Hahaha, I know. Sometimes I get too excited and post the code too early. Anyway, this one will only attempt to download the user's own repos and not forks. I changed the URL to display only source repos. It might be fun to get the forked ones too, though.
# rootVIII
# Download/clone all of a user's public source repositories
# Pass the Github user's username with the -u option
# Usage: python git_clones.py -u <github username>
# Example: python git_clones.py -u rootVIII
#
from argparse import ArgumentParser
from sys import exit, version_info
from re import findall
from subprocess import call
try:
    from urllib.request import urlopen
except ImportError:
    from urllib2 import Request, urlopen


class GitClones:
    def __init__(self, user):
        self.url = "https://github.com/%s" % user
        self.url += "?&tab=repositories&q=&type=source"
        self.git_clone = "git clone https://github.com/%s/" % user
        self.git_clone += "%s.git"
        self.user = user

    def http_get(self):
        # Python 3: urlopen returns bytes, so decode to text
        if version_info[0] != 2:
            req = urlopen(self.url)
            return req.read().decode('utf-8')
        # Python 2: go through urllib2's Request
        req = Request(self.url)
        request = urlopen(req)
        return request.read()

    def get_repo_data(self):
        try:
            response = self.http_get()
        except Exception:
            print("Unable to make request to %s's Github page" % self.user)
            exit(1)
        else:
            # Match links of the form <a href="/<user>/<repo>" ...
            pattern = r"<a\s?href\W+%s/(.*)\"\s+" % self.user
            for line in findall(pattern, response):
                # The capture is greedy; keep only the text before the first quote
                yield line.split('\"')[0]

    def get_repositories(self):
        return [repo for repo in self.get_repo_data()]

    def download(self, git_repos):
        for git in git_repos:
            cmd = self.git_clone % git
            try:
                # cmd.split() gives ['git', 'clone', '<url>'] for subprocess.call
                call(cmd.split())
            except Exception as e:
                print(e)
                print('unable to download:%s\n ' % git)


if __name__ == "__main__":
    message = 'Usage: python git_clones.py -u <github username>'
    h = 'Github Username'
    parser = ArgumentParser(description=message)
    parser.add_argument('-u', '--user', required=True, help=h)
    d = parser.parse_args()
    clones = GitClones(d.user)
    repositories = clones.get_repositories()
    clones.download(repositories)

RE: Git clone all of a Github user's public repositories (download all repositories) - buran - Jul-10-2019
Just to mention that, given there is an API, it's better to make use of it. It would be better on so many counts...
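For readers curious what the API route mentioned above might look like, here is a minimal sketch, not anything posted in the thread. It assumes Python 3 and the public GitHub REST endpoint `https://api.github.com/users/<user>/repos`, whose JSON records include `clone_url` and `fork` fields. The names `API_URL`, `source_clone_urls` and `fetch_repos` are made up for illustration, and unauthenticated requests are rate limited.

```python
import json
from urllib.request import Request, urlopen

# Assumed endpoint: public repositories of a user, 100 records per page.
API_URL = "https://api.github.com/users/%s/repos?per_page=100&page=%d"


def source_clone_urls(repos):
    """Return the clone URLs of repository records that are not forks."""
    return [r["clone_url"] for r in repos if not r.get("fork")]


def fetch_repos(user):
    """Collect a user's public repository records, one page at a time."""
    repos, page = [], 1
    while True:
        req = Request(API_URL % (user, page),
                      headers={"Accept": "application/vnd.github+json"})
        with urlopen(req) as resp:
            batch = json.loads(resp.read().decode("utf-8"))
        if not batch:  # an empty page means there is nothing left
            return repos
        repos.extend(batch)
        page += 1
```

Something like `source_clone_urls(fetch_repos("rootVIII"))` would then give a list of URLs that can be handed to `git clone` one by one, with forks already filtered out and no HTML or regex involved.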
RE: Git clone all of a Github user's public repositories (download all repositories) - Skaperen - Jul-10-2019
nice to know there is an API. they don't do much to let people know that's there.

RE: Git clone all of a Github user's public repositories (download all repositories) - rootVIII - Jul-11-2019
Yup, I had no idea. I can also guarantee that it was more fun writing the code than making API calls :)

RE: Git clone all of a Github user's public repositories (download all repositories) - buran - Jul-13-2019
(Jul-11-2019, 05:27 AM)rootVIII Wrote: I can also guarantee that it was more fun writing the code than making API calls :)
You can write code that uses API calls. Or even write a Python wrapper (although there are many available).

RE: Git clone all of a Github user's public repositories (download all repositories) - Skaperen - Jul-17-2019
i am now wondering why not:
self.git_clone = "git clone https://github.com/%s/%%s" % user
instead of lines 21-22 in post #14.

RE: Git clone all of a Github user's public repositories (download all repositories) - rootVIII - Jul-20-2019
I wanted to leave that last %s for the loop, to fill in the repository name. But to be honest, I didn't know that format specifier existed. Pretty neat.
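A quick illustration of the `%%` escape discussed in the last two posts: in `%` formatting, `%%` produces a literal `%`, so a `%s` placeholder can survive the first substitution. This is only a sketch; the function names are made up, and the one-step version keeps the `.git` suffix from the original code, which the suggested line leaves out.

```python
def template_two_step(user):
    """Build the clone template the way the original __init__ does."""
    template = "git clone https://github.com/%s/" % user
    template += "%s.git"
    return template


def template_one_step(user):
    """Same template in one expression: %%s comes out as a literal %s."""
    return "git clone https://github.com/%s/%%s.git" % user


if __name__ == "__main__":
    template = template_one_step("rootVIII")
    print(template)            # the repo placeholder is still there
    print(template % "demo")   # filled in later, as in the download loop
```

Both functions return the same string, so the repository name can still be filled in per iteration of the loop.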