
How I’m generating a really unique client ID for every user

Creating a unique number for a user that is publicly shareable without compromising sensitive information can be challenging.

The term client_id may have already given away what I’m working on, but if that’s not the case, it’s Google Analytics.

In the context of Predifix (a condo management app that’s a side project of mine), the Head of Marketing, also known as my wife, wanted to track a few things in the app, like when a user signs up or when a subscription is created, updated, or canceled. All of these happen on the server, so the JavaScript implementation of Google Analytics wouldn’t help here.

So I created a GoogleAnalyticsService class in Ruby that uses the Google Analytics Measurement Protocol API to send events to GA. It’s perfect for what I needed, since there is no client-side JavaScript running when a subscription gets automatically canceled, for instance: it’s a webhook that calls this function.
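To make the idea concrete, here is a minimal sketch of what such a service could look like. It is not the actual Predifix class: the method names, the constructor arguments, and the example event name are assumptions for illustration. It targets the GA4 Measurement Protocol endpoint, which takes a measurement_id and api_secret in the query string and a JSON body containing the client_id and a list of events.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical sketch, not the author's actual GoogleAnalyticsService.
class GoogleAnalyticsService
  ENDPOINT = "https://www.google-analytics.com/mp/collect"

  def initialize(measurement_id:, api_secret:)
    @measurement_id = measurement_id
    @api_secret = api_secret
  end

  # Builds the POST request without sending it, so it can be inspected.
  def build_request(client_id:, event_name:, params: {})
    uri = URI(ENDPOINT)
    uri.query = URI.encode_www_form(
      measurement_id: @measurement_id,
      api_secret: @api_secret
    )

    request = Net::HTTP::Post.new(uri)
    request["Content-Type"] = "application/json"
    request.body = JSON.generate(
      client_id: client_id,
      events: [{ name: event_name, params: params }]
    )
    request
  end

  # Sends the event; meant to be called from a webhook handler or a job.
  def track(client_id:, event_name:, params: {})
    request = build_request(client_id: client_id, event_name: event_name, params: params)
    Net::HTTP.start(request.uri.hostname, request.uri.port, use_ssl: true) do |http|
      http.request(request)
    end
  end
end
```

A webhook handler could then call something like service.track(client_id: user_client_id, event_name: "subscription_canceled"), where the event name is whatever your marketing setup expects.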

The algorithm

Context aside, the way I ended up doing this is designed so that no two users end up with the same client_id attached to them. This is accomplished by combining several pieces of data into the number we’re generating.


Oh, now is a good time to mention that the client_id that Google expects must have a very specific format, such as 123456.0123456789. Digits only, with a dot after the first 6 digits, for a total of 16 digits.
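As a quick sanity check, the format described above (6 digits, a dot, then 10 digits) can be expressed as a regular expression. The constant and method names here are made up for illustration.

```ruby
# Matches exactly 6 digits, a dot, and 10 digits, e.g. 123456.0123456789.
CLIENT_ID_FORMAT = /\A\d{6}\.\d{10}\z/

# Hypothetical helper: true when the value follows the format above.
def valid_client_id?(value)
  CLIENT_ID_FORMAT.match?(value.to_s)
end

puts valid_client_id?("123456.0123456789") # true
puts valid_client_id?("12345.0123456789")  # false, only 5 digits before the dot
```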


So I ended up using a 6-digit random number, a 6-digit user ID and a 10-digit timestamp. If your math is correct, you’ll notice that this would give you a 22-digit number, not the 16 digits we’re looking for.

The missing part is that the 6-digit random number and the 6-digit user ID are added together, and the sum is then wrapped back into a 6-digit number (modulo 1,000,000).

The code

Ruby
def generate_unique_id(user_id)
  # Get a 10 digit timestamp
  timestamp = Time.now.to_i

  # Generate a random 6 digit number
  random_part = format("%06d", rand(0..999999))

  # Transform the user_id argument into a 6 digit number
  user_id_part = format("%06d", user_id % 1_000_000)

  # Combine user_id with the generated random number
  combined_part = (random_part.to_i + user_id_part.to_i) % 1_000_000

  # Returned value in the format of 123456.0123456789
  "#{format("%06d", combined_part)}.#{timestamp}"
end

Let’s go over an example to see how this thing actually works.

The timestamp is 1755953349.
The random_part is 651270.
The user_id is 64, so the user_id_part would be 000064.
Lastly, the combined_part where we mix the random_part with the user_id_part would be 651334 (651270 + 64).
So the returned value would be 651334.1755953349.

There you have it! Your unique client_id!

Edge cases

Since we’re using the user_id (I didn’t mention this before, but it comes from the database and it’s incremental), we can be sure there will be no two users with the same user_id.

Although that’s true, since we’re reformatting the user_id_part to a 6 digit number, as soon as we reach 1 million users in the database, we will start to have “duplicates”. Not really, but let’s go over it.

If the user_id is something like 1200321, the user_id_part ends up as 200321, which is the same user_id_part that user 200321 already got. Well, that’s one thing.
Another relevant aspect is that we’re mixing it with a 6-digit random number, so two such users only share a client_id if their random numbers also happen to match. The chances of ending up with an actual duplicate client_id are slim.

Besides, we’re also adding a timestamp to the mix, so two IDs can only collide if they are generated within the same second. Unless you’re expecting hundreds of thousands of users signing up to your app at the exact same time, that is unlikely to matter. And even if that were the case, the server won’t process all those requests at the exact same time.
Again, very slim chances of getting the exact same client_id, but it’s there.
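If that remaining chance bothers you, one option is to check each candidate against the IDs already handed out and retry on a clash. The sketch below is not part of the original implementation: the method name claim_client_id and the in-memory Set are stand-ins, and in a real app a unique index on the column storing the client_id would be the more likely tool.

```ruby
require "set"

# Compact copy of the generator so this block runs on its own.
def generate_unique_id(user_id)
  combined_part = (rand(0..999_999) + user_id) % 1_000_000
  format("%06d.%d", combined_part, Time.now.to_i)
end

# `taken` is a hypothetical stand-in for whatever stores issued client_ids.
def claim_client_id(user_id, taken, max_attempts: 5)
  max_attempts.times do
    candidate = generate_unique_id(user_id)
    # Set#add? returns nil when the value is already present.
    return candidate if taken.add?(candidate)
  end
  raise "no unused client_id found after #{max_attempts} attempts"
end

taken = Set.new
puts claim_client_id(64, taken)
```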

Considerations

In this case, I got away with generating a client_id only when the user signs up. But if for some reason you need to generate one as soon as the user lands on your app or website, you won’t be able to rely on an incremental number coming from your database, the user_id.

Before clarifying this need with the marketing department, I was going for this scenario, and I would probably end up relying on the user’s IP address as a somewhat unique number.
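Here is a rough sketch of what that IP-based variant might have looked like. I never shipped this, so treat the method name and the keyword arguments as assumptions. It uses Ruby’s standard IPAddr class to turn the address into an integer and then applies the same modulo trick as before.

```ruby
require "ipaddr"

# Hypothetical pre-signup variant: the IP address replaces the user_id.
# `random` and `now` are injectable only to make the example repeatable.
def visitor_client_id(ip_address, random: rand(0..999_999), now: Time.now)
  ip_part = IPAddr.new(ip_address).to_i % 1_000_000
  combined_part = (random + ip_part) % 1_000_000
  format("%06d.%d", combined_part, now.to_i)
end

puts visitor_client_id("192.168.1.10", random: 0, now: Time.at(1_755_953_349))
# 235786.1755953349
```

Keep in mind that many visitors can share one IP address (offices, mobile carriers), so here the random part and the timestamp carry most of the uniqueness.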

And in this case, the timestamp would also have a second use. You see, when the user signs up, I know the date when that happened through the created_at date in the users table. But if you don’t have a user to begin with, as they haven’t signed up yet, the timestamp in their client_id would also tell you when they first made contact with your app or website. Might be useful for your marketing needs!
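Reading that first-contact time back out of a client_id is a one-liner, since everything after the dot is a Unix timestamp. The helper name below is made up for this example.

```ruby
# Hypothetical helper: returns the time encoded after the dot, in UTC.
def first_contact_at(client_id)
  seconds = client_id.split(".", 2).last.to_i
  Time.at(seconds).utc
end

puts first_contact_at("123456.1755953349") # 2025-08-23 12:49:09 UTC
```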

One last point is that in the scenario of a client_id generated before a user signs up, you will likely need to rely on a cookie of some sort, as the user will need to carry that ID throughout their journey up until they sign up. Which also means you will need to inject it into the signup form somehow, which may or may not be complex depending on which domains you’re using. And obviously you will need to decide what to do with that cookie once the user signs up. When that happens, the client_id will be accessible through the user record in the database, so you effectively don’t need it on the client side anymore.
Should you clean it, or just leave it there? Up to you to consider.

A last point after the last point, regarding the client side. If you also have a client implementation of Google Analytics, through JavaScript, chances are you will also need the client_id in the client. The way I’m doing this is by attaching it to the session object and then exposing it in a <body> attribute so that I can fetch it before initializing Google Analytics through JavaScript.
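As a rough illustration of the server side of that hand-off, here is a sketch using Ruby’s standard ERB library. The attribute name data-ga-client-id and the session key :ga_client_id are assumptions, and a plain hash stands in for the real session object.

```ruby
require "erb"

# Hypothetical layout fragment; the attribute name is an assumption.
BODY_TEMPLATE = ERB.new(
  '<body data-ga-client-id="<%= ERB::Util.html_escape(client_id) %>">'
)

# Renders the opening <body> tag with the client_id taken from the session.
def body_tag_for(session)
  BODY_TEMPLATE.result_with_hash(client_id: session[:ga_client_id].to_s)
end

session = { ga_client_id: "123456.1755953349" } # stand-in for the real session
puts body_tag_for(session)
# <body data-ga-client-id="123456.1755953349">
```

On the JavaScript side, an attribute like this would be readable through document.body.dataset before Google Analytics is initialized.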

Bottom line

That’s it, now you know how to generate an actually unique ID for a user without revealing sensitive data such as the user_id from the database record or, even worse, some personally identifiable information.

Gotta love these small projects inside a big project!

Photo of Pedro