Can a single machine host more than 100,000 projects? A look at the hardware platform behind this Git hosting service

If you want to host your project, consider GitLab.com, where we run a single instance of GitLab. The service currently has nearly 20,000 users and hosts more than 100,000 projects on a single machine.

Single Server

Previously, GitLab.com ran on Amazon's AWS platform, on the largest instance type available. As the user base kept growing, a single AWS instance could no longer meet our needs, mainly because of its CPU and storage limits. We had to find an alternative.

100,000 repositories require multiple terabytes of storage, so storage capacity is critical. Because we use Git, the storage must be a single file system rather than an object storage service such as Amazon S3, and we want to be able to scale it easily. In addition, large numbers of people pushing and pulling code put heavy demands on the CPU, so more cores help keep the service responsive under high load.

The most cost-effective solution turned out to be running our own servers. Fortunately, GitLab is easy to run on your own hardware.

We therefore currently have two independent servers for running GitLab.com: one is the active primary and the other is a standby. The server configuration is as follows:

  • Server model: HP DL180 G6 (manufactured in 2009)

  • Processor: 2x Intel Xeon X5690 (24 logical cores in total)

  • 32GB RAM

  • 12x 2TB HDDs (two in RAID 1 for the root volume, the other ten in RAID 10 with an ext4 filesystem)

We actually only use 16 of the cores.
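
To make the disk layout above concrete, here is a rough capacity estimate. It is only a sketch: it assumes two-way mirroring in the RAID 10 array and ignores filesystem overhead and the difference between decimal and binary terabytes. The function names are illustrative and not part of any tool.

```python
def raid1_usable_tb(disks: int, disk_tb: float) -> float:
    """RAID 1 mirrors all disks, so usable space equals one disk."""
    if disks < 2:
        raise ValueError("RAID 1 needs at least 2 disks")
    return disk_tb


def raid10_usable_tb(disks: int, disk_tb: float) -> float:
    """RAID 10 stripes across mirrored pairs (two-way mirrors assumed),
    so usable space is half of the raw capacity."""
    if disks < 4 or disks % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return disks * disk_tb / 2


# Layout from the list above: 2 disks for root, 10 disks for repository data.
root_tb = raid1_usable_tb(2, 2.0)
data_tb = raid10_usable_tb(10, 2.0)
print(f"root volume: {root_tb:.0f} TB usable, data volume: {data_tb:.0f} TB usable")
```

Under these assumptions, the ten data disks provide about 10 TB of usable space on a single file system, which matches the "multiple terabytes" requirement described earlier.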

Failure and failover

Moving away from Amazon meant we could no longer rely on some of the features of the AWS platform, so we needed our own failover mechanism in case a server fails.

We use DRBD to create a master-slave setup in which only one application server is active at a time. If there is a problem, the setup switches over to the other server.

Our DRBD tooling is available to subscribers.
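
The active/standby idea can be illustrated with a toy model. This is not DRBD's API or GitLab's tooling: DRBD replicates block devices between hosts, and the classes and names below (Node, ActivePassivePair, server-a, server-b) are hypothetical, chosen only to show the rule that exactly one server is active at a time.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool = True


class ActivePassivePair:
    """Toy model: exactly one node serves traffic at any time."""

    def __init__(self, primary: Node, standby: Node) -> None:
        self.primary = primary
        self.standby = standby

    def check_and_failover(self) -> Node:
        """Swap roles if the primary is unhealthy; return the active node."""
        if not self.primary.healthy:
            if not self.standby.healthy:
                raise RuntimeError("both nodes are down")
            # Promote the standby; the failed node becomes the new standby.
            self.primary, self.standby = self.standby, self.primary
        return self.primary


pair = ActivePassivePair(Node("server-a"), Node("server-b"))
print(pair.check_and_failover().name)  # primary is healthy, nothing changes

pair.primary.healthy = False           # simulate a failure of the active server
print(pair.check_and_failover().name)  # the standby has been promoted
```

In a real deployment the health check and promotion would be driven by the replication and cluster tooling rather than by application code like this.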

Future scalability

GitLab.com runs well on its current hardware, but it is growing rapidly. Scaling the existing hardware is expensive and, in some respects, difficult.

In the future, GitLab.com will be hosted on Amazon's AWS platform again, which will make horizontal scaling easy. In addition, Amazon has just announced EBS volumes larger than 10TB, which will make our migration easier.

Original English text: The hardware that powers 100,000 git repositories