Infinite scale with Ruby on Rails and PostgreSQL sharding

Every rails app with enough growth will eventually hit the point where the database can no longer be vertically scaled.

A common solution is to add one or more follower databases to alleviate stress caused by too many reads. This does help, but with enough growth, you will eventually see your indexes run out of ram. You may also see your CPU, disk or network resources exhausted from too many writes.

Horizontal sharding

Horizontal sharding with multiple write databases can use any number of small affordable databases to handle virtually any scale you throw at it, and I'll explain how you can implement it with Ruby on Rails 6.1 and PostgreSQL.

Sharding by UUID

UUIDs work great with horizontal sharding. You will never have a key conflict between your shards. The uniqueness of a UUID can also be used to determine which physical shard a record should live on.

A UUID is a 128 bit number. Normally you see UUIDs represented as 16 bit strings like d4ff2862-e12e-4702-8ea7-c436b07f700f, but they can also be represented as 128 bits or as a number.

Here we convert the UUID into a number using to_i with a radix of 16 since the UUID is in base 16 (0-9 and a-f).

'230d30d2-296d-4d60-8064-91204aa3fe1c'.tr('-', '').to_i(16)
# => 283121189935762691698302501111749111823

The tr method is used to strip out the dashes and to_i(16) converts the base 16 string to an integer.

'230d30d2-296d-4d60-8064-91204aa3fe1c'.tr('-', '').to_i(16) % 16 + 1
# => 12

If you divide by your number of shards, let's say 16 shards for this example, and check the remainder using the modulo operator, you'll get a number between 0 and 15. In this example using UUIDs belonging to users, this user would live on shard 12. The random bit nature of a UUID ensures users will be equally distributed across the shards.

Doubling the number of shards

If you need a larger number of shards down the road, you can increase the divisor by the number of shards you need. Let's pretend like we increased to 32 shards.

'230d30d2-296d-4d60-8064-91204aa3fe1c'.tr('-', '').to_i(16) % 32 + 1
# => 28

This id would now live on shard 28. 28 is derived from shard 12 + 16. Half the data on shard 12 would stay on 12, and the other half moves 16 shards over to shard 28.

Half of shard 1's data will be moved to shard 17, half of shard 2's data will be moved to shard 18 ... and half of shard 16's data will be moved to shard 32.

Sharding your data

Depending on your application, you might have a clear main table to shard on other than users. Usually there is one table that contains the majority of data, or one table that associates to the majority of data. Most apps get sharded across either users, time series data like events, or SAAS level objects like stores or organizations.

If you have tables that do not associate to the main shard key in any way, it would be worth looking into having a primary database that is not sharded. Check out the rails guide on multiple databases to learn how to set up a primary database on ApplicationRecord then use a ShardRecord that holds all of your sharded data.

Let's get started by setting up ActiveRecord to handle connection swapping.

Enable Rails 6.1 connection handling

This example is using PostgreSQL and requires Ruby on Rails 6.1 or greater. Rails 6.1 introduced better native sharding functionality and connection swapping.

First, update your application.rb to use the newest version of connection handling.

# config/application.rb

config.active_record.legacy_connection_handling = false

If you are using a version older than 6.1, you'll need to upgrade or alter this tutorial to use something like https://github.com/thiagopradi/octopus to handle connection swapping and connection pooling.

Switch primary keys to UUIDs

Next let's enable UUIDs. Simply create a new migration, enable the extension and create/migrate the database.

In your terminal run:

rails g migration enable_uuids

Then add this line to your new migration:

enable_extension 'pgcrypto'

You'll also want to update the Rails generators to force all new tables to also use UUIDs for their primary key.

# config/intitializers/generators.rb

Rails.application.config.generators do |generator|
  generator.orm :active_record, primary_key_type: :uuid
end

Add multiple databases

Add multiple shards to your development, test and production environments. Take note of the new nested syntax. There is an extra nested layer under development for the shard name.

You can add as many databases as you need, but keep it a factor of 2 (2, 4, 8, 16, 32, 64, etc). Factors of 2 make it easier to add more databases down the road.

# config/database.yml

development:
  shard_1:
    <<: *default
    database: myapp_development_1
  shard_2:
    <<: *default
    database: myapp_development_2
  shard_3:
    <<: *default
    database: myapp_development_3
  shard_4:
    <<: *default
    database: myapp_development_4

Then create and migrate your databases:

rake db:create
rake db:migrate

Rails 6.1 handles the creation and migration of each shard automatically. You should see in the logs that 4 databases were created.

Connecting to the shards

For each shard, you will need to tell ActiveRecord about the shards using the connects_to method.

We store the newly created connections in a constant so they can be easily accessed in our Shard service which we will create in a moment.

# app/models/ApplicationRecord

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  SHARDS = connects_to shards: {
    shard_1: { writing: :shard_1 },
    shard_2: { writing: :shard_2 },
    shard_3: { writing: :shard_3 },
    shard_4: { writing: :shard_4 },
  }
end

This example is only using "writing" databases, but you could also add a "replica" database for each shard. Check out the Rails 6.1 docs on replicas for more info.

Swapping connections between shards

Let's create a Shard service that can determine the physical shard key from a UUID, swap connections to the correct database, and run queries on all the shards at once.

# app/services/Shard.rb

module Shard
  def self.physical_shard_key(uuid)
    shard_number = uuid.tr('-', '').to_i(16) % ApplicationRecord::SHARDS.count
    "shard_#{shard_number + 1}".to_sym
  end

  def self.use(shard)
    ApplicationRecord.connected_to(shard: shard, role: :writing) do
      yield
    end
  end

  def self.for(uuid)
    use(physical_shard_key(uuid)) do
      yield
    end
  end

  def self.all
    results = []
    threads = []

    ApplicationRecord::SHARDS.each do |shard_connection|
      shard_name = shard_connection.db_config.name.to_sym

      threads << Thread.new do
        results << use(shard_name) do
          yield
        end
      end
    end

    threads.each(&:join)
    results.flatten
  end
end

Breaking this down, the physical_shard_key determines which shard to use based on the UUID of the primary table you are sharding on.

The use and for methods use the new Rails 6.1 connected_to method to swap the database connection to the shard of your choice. Then the all method uses threading to run a block of code connected to every database in parallel then collate the results.

Example usage

To run database queries on the right shard, you need to know the primary key of the record you want to interact with. In our example that is sharded on the users table, we always need to set the shard using a user's primary key.

The most common place to find this ID prior to database access is from your authentication code, usually a cookie that stores information about the current user.

We can use Shard.for to swap to the correct shard using a user ID.

current_user = Shard.for(user_id) { User.find(user_id) }

Or find associated data.

photos = Shard.for(user.id) { user.photos }

You can easily update any records on the shard.

Shard.for(user.id) { user.update(name: 'John') }

If we are creating a new user, we need to first create a random UUID so we know which shard to store the data on.

user = User.new(id: SecureRandom.uuid)

Shard.for(user.id) { user.save }

We can easily save data associated with the user on the same shard.

Shard.for(user.id) { user.comments.create(message: message) }

This can be associated as deep as you want, as long as it was saved associated with the same user.

like_count = Shard.for(user.id) { user.comments.first.likes.count }

So what if a record is shared by multiple users across multiple shards? Use the all method to search every database. Using threads, the requests run concurrently since the GIL does not prevent I/O from running concurrently. This means you can request and collate the results from all shards at similar speeds as requesting from one shard.

friend_ids = Shard.for(user_id) { User.find(user_id).friend_ids }
friends = Shard.all { User.where(id: friend_ids) }

Your table reference in this example should not have a foreign key constraint, since constraints don't work across shards.

Another example would be to count a table in each database.

Shard.all { Photo.count }
# => [52324, 45623, 54323, 54362]

If you ever exceed your database connections, you will need to use a tool like pgbouncer to consolidate connections.

Adding more shards

If you need to add more shards, it should be done by factors of 2. 2 shards to 4, 4 shards to 8, 8 shards to 16, etc.

Start by creating a "reading" database for each shard.

After all the "reading" databases are caught up, go into maintenance mode, update the "reading" databases to be "writing" databases and add them to the database.yml in the correct order. Udate the ApplicationRecord with the new shards and leave maintenance mode.

The correct order for going from 4 shards to 8 would look like this:

shard 1
shard 2
shard 3
shard 4
shard 1 follower
shard 2 follower
shard 3 follower
shard 4 follower

This order is determined by the math of the modulus function pointing to which shard an ID should live on.

After adding new shards, each shard will have half of it's data that doesn't belong there any more. Simply iterate through the user's, determine if the user belongs in that database using the Shard.physical_shard_key method, then delete it and it's associated data if it does not.

Wrapping it up

These examples open up a world where your Rails application is no longer bound by your database server limits. There is clearly an extra level of discipline that needs to be added around database requests, but with great power comes great responsibility. I hope this helps and makes scaling beyond billions of records as easy as it was for me.

Published Sep 1, 2020

Engineering leader who is passionate about getting small teams to do great things.Corey Stout on LinkedIn