Have you ever had to process tons of requests, only to find that your machine cannot keep up with the volume?
This is a dreadful situation because sooner or later your service will stall or the requests will start timing out. This is exactly the kind of situation where a message queue helps.
With a message queue, you simply take the incoming requests and put them in a queue. Your service can then pull these requests from the queue one by one and process them in its own time.
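To make the idea concrete, here is a minimal sketch using only Python's standard library. It is not Cloud Pub/Sub, just an in-process queue, and the names `run_demo`, `num_requests` and `processing_seconds` are made up for this illustration.

```python
import queue
import threading
import time


def run_demo(num_requests=5, processing_seconds=0.01):
    """Buffer a burst of requests in a queue and process them one by one."""
    requests = queue.Queue()
    processed = []

    def worker():
        while True:
            item = requests.get()
            if item is None:  # sentinel value: no more work is coming
                break
            time.sleep(processing_seconds)  # simulate slow processing
            processed.append(item)

    consumer = threading.Thread(target=worker)
    consumer.start()

    # The whole burst arrives at once; put() returns immediately,
    # so the sender never waits for the slow worker.
    for i in range(num_requests):
        requests.put(f"request-{i}")
    requests.put(None)

    consumer.join()
    return processed


if __name__ == "__main__":
    print(run_demo())
```

The sender finishes enqueueing long before the worker is done, yet no request is dropped and the order is preserved.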
What is Pub/Sub?
Well, Pub/Sub is an implementation of the Publisher/Subscriber design. Cloud Pub/Sub is a Google product that brings the flexibility and reliability of enterprise message-oriented middleware to the cloud. In short, Cloud Pub/Sub provides many-to-many, asynchronous messaging that decouples senders and receivers, and it allows for secure and highly available communication among independently written applications.
The most important part of the definition is that “Pub/Sub decouples the Senders (Publishers of the messages) from the Receivers (Subscribers of the messages)”. This means that two services can talk to each other using a common language, without having any kind of knowledge about the other.
This decoupling means the services never need direct access to one another, which makes the communication more secure. And because neither side has to wait for the other, each service needs fewer resources of its own even when dealing with a lot of requests. This makes Cloud Pub/Sub a really good way to create highly available communication between different services.
So, the next time someone asks you what Pub/Sub is, tell them it is just a message queue that decouples senders from receivers.
How does Pub/Sub work?
This is a very simple and elegant way to communicate between two parties. Let’s take a simple example. Suppose you visit an e-commerce store to buy a product, and you find that the product is not yet available but there is an option to subscribe for a notification. This option takes your email address and saves it in the store’s database as a subscriber, so that whenever the product you subscribed to is launched, you get a notification.
The pub/sub model works exactly the same way.
There is a publisher and there are multiple subscribers. The publisher sends a message to a particular topic and all the subscribers who have opted to get a notification on that topic will receive the data.
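The topic idea can be sketched in a few lines of Python. This is a toy in-memory broker, not the Cloud Pub/Sub API; the `Broker` class and the `product-launch` topic name are assumptions made up for this example.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory broker: every subscriber of a topic gets every message."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic name -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver message to all subscribers; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


if __name__ == "__main__":
    broker = Broker()
    alice_inbox, bob_inbox = [], []
    broker.subscribe("product-launch", alice_inbox.append)
    broker.subscribe("product-launch", bob_inbox.append)
    broker.publish("product-launch", "The product is now available")
    print(alice_inbox, bob_inbox)
```

Notice that the publisher only knows the topic name. It never sees who is subscribed, which is exactly the decoupling described earlier.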
Simple. And. Elegant. 🙂
I think the Pub/Sub model is pretty clear now, so let’s go a little deeper and understand how it simplifies communication.
The Publisher/Subscriber model is part of the message queue paradigm. A message queue simplifies communication because the sender and the receiver no longer need to work synchronously. Each side can work at its own pace.
The sender no longer needs any knowledge of the receiver. It simply sends messages to an intermediate message queue, and the receiver processes them one by one when it is ready. No coupling required.
Once the receiver is done with the processing, it sends the acknowledgement back to the intermediate message queue so that the message queue can delete the processed message from the queue.
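Here is a rough sketch of that acknowledgement step in plain Python. The `AckQueue` class and its method names are hypothetical and only illustrate the idea that a message stays in the queue until it is acknowledged.

```python
import itertools


class AckQueue:
    """Toy queue that keeps a message until the receiver acknowledges it."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}  # message id -> payload, kept in arrival order

    def send(self, payload):
        msg_id = next(self._ids)
        self._pending[msg_id] = payload
        return msg_id

    def pull(self):
        """Return the oldest stored message as (id, payload), or None if empty."""
        for msg_id, payload in self._pending.items():
            return msg_id, payload
        return None

    def ack(self, msg_id):
        """Delete an acknowledged message; return False if the id is unknown."""
        if msg_id in self._pending:
            del self._pending[msg_id]
            return True
        return False

    def __len__(self):
        return len(self._pending)
```

Pulling a message does not remove it. Only `ack` does, so a receiver that crashes halfway through processing does not lose the message.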
In short, the Publisher/Subscriber model is an asynchronous decoupled way of communicating between two services.
Where to Use Cloud Pub/Sub?
Let’s list the most common use cases for the Pub/Sub model. They are not restricted to Cloud Pub/Sub; any publisher/subscriber system can handle the tasks below.
Common use cases
- Balancing workloads in network clusters. For example, a large queue of tasks can be efficiently distributed among multiple workers, such as Google Compute Engine instances.
- Implementing asynchronous workflows. For example, an order processing application can place an order on a topic, from which it can be processed by one or more workers.
- Distributing event notifications. For example, a service that accepts user signups can send notifications whenever a new user registers, and downstream services can subscribe to receive notifications of the event.
- Refreshing distributed caches. For example, an application can publish invalidation events to update the IDs of objects that have changed.
- Logging to multiple systems. For example, a compute instance can write logs to the monitoring system, to a database for later querying, and so on.
- Data streaming from various processes or devices. For example, a residential sensor can stream data to backend servers hosted in the cloud.
- Reliability improvement. For example, a single-zone computing service can operate in additional zones by subscribing to a common topic, to recover from failures in a zone or region.
I’ve personally used it for implementing asynchronous workflows and distributing event notifications.
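The first use case in the list, balancing a workload across workers, can be sketched locally with threads standing in for worker machines. The function name `distribute` and the `worker-N` labels are assumptions for this example only.

```python
import queue
import threading


def distribute(tasks, num_workers=3):
    """Share tasks among worker threads through one common queue.

    Returns a list of (task, worker_name) pairs in completion order.
    """
    work = queue.Queue()
    for task in tasks:
        work.put(task)

    handled = []
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                task = work.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            with lock:
                handled.append((task, name))

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled


if __name__ == "__main__":
    for task, name in distribute(range(6)):
        print(task, "handled by", name)
```

Each task is taken off the queue by exactly one worker, and which worker gets which task depends on who is free at that moment.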
Benefits of using Google Cloud Pub/Sub
There are many benefits of using Google Cloud Pub/Sub.
Use as a Service
The main advantage of using Google Cloud Pub/Sub is that it is a fully managed service. Everything from maintaining the queue to delivering the data to the subscribers is taken care of by Google.
You just have to create topics and register subscribers.
The publisher service sends messages to the topic, and everything beyond that is handled by Google Cloud Pub/Sub: maintaining the queue, delivering the messages to the subscribers, tracking the acknowledgements and so on.
Auto adjust message delivery
This is another useful feature that you get out of the box with Google Cloud Pub/Sub. It adapts the delivery rate to the processing speed of the subscribers.
Suppose a subscriber takes 2 seconds to process a message. Cloud Pub/Sub will adjust the delivery speed so that the receiving service is not overloaded with requests.
This feature comes in pretty handy when you have to process bulk messages reliably.
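One simple way to picture this throttling is a cap on how many unacknowledged messages a subscriber may hold at once. The sketch below is only a model of that idea, not how Cloud Pub/Sub is implemented internally; `ThrottledDelivery` and `max_outstanding` are made-up names.

```python
from collections import deque


class ThrottledDelivery:
    """Toy model: never hand out more than max_outstanding unacked messages.

    Messages are assumed to be unique, hashable values such as strings.
    """

    def __init__(self, messages, max_outstanding=2):
        self._backlog = deque(messages)
        self._outstanding = set()
        self._max = max_outstanding

    def deliver(self):
        """Return the next batch, stopping once the outstanding limit is hit."""
        batch = []
        while self._backlog and len(self._outstanding) < self._max:
            message = self._backlog.popleft()
            self._outstanding.add(message)
            batch.append(message)
        return batch

    def ack(self, message):
        """Mark a message as processed, freeing a slot for the next one."""
        self._outstanding.discard(message)
```

A slow subscriber acknowledges slowly, so slots free up slowly and new messages arrive at the pace the subscriber can actually handle.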
It also keeps redelivering a message until the subscriber has acknowledged it. By default, Pub/Sub retains unacknowledged messages for 7 days, so it will keep trying to deliver a message until the subscriber acknowledges it or the 7 days run out, whichever comes first.
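The retry behaviour can be modelled with a small loop. Time is simulated here, so nothing actually waits for 7 days. The function name and the `retry_interval_seconds` parameter are hypothetical; the real service decides its own redelivery timing.

```python
SEVEN_DAYS = 7 * 24 * 3600  # retention window in seconds


def deliver_until_acked(message, handler, retention_seconds=SEVEN_DAYS,
                        retry_interval_seconds=600):
    """Redeliver message until handler returns True or retention runs out.

    The handler's return value plays the role of the acknowledgement.
    Returns a tuple (acked, attempts).
    """
    elapsed = 0
    attempts = 0
    while elapsed < retention_seconds:
        attempts += 1
        if handler(message):
            return True, attempts
        elapsed += retry_interval_seconds  # simulated wait before the retry
    return False, attempts


if __name__ == "__main__":
    outcomes = iter([False, False, True])  # fail twice, then acknowledge
    print(deliver_until_acked("invoice-42", lambda _msg: next(outcomes)))
```

A handler that never acknowledges stops receiving the message once the retention window has passed, at which point the message is gone for good.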
Trigger on Events
Cloud Pub/Sub can also be wired up to events from other Google Cloud services, so that subscribers are triggered when those events occur.
The event could be a file upload to Cloud Storage. As soon as a file has been uploaded to a Cloud Storage bucket, a finalize event is fired. The bucket can be configured to publish such events to a Pub/Sub topic, so that as soon as the file lands in the bucket, it can be picked up for processing. All of this happens asynchronously in the backend.
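On the subscriber side, reacting to such an event mostly means checking the message attributes. Cloud Storage notifications carry attributes such as `eventType`, `bucketId` and `objectId`, and a finished upload is reported as `OBJECT_FINALIZE`. The sketch below works on a plain dictionary so it can run without any Google library; `handle_notification` and `process_file` are names invented for this example.

```python
def handle_notification(attributes, process_file):
    """Call process_file(bucket, object_name) for finalized objects only.

    attributes is the attribute dictionary of a received notification.
    Returns True if the event was processed, False if it was ignored.
    """
    if attributes.get("eventType") != "OBJECT_FINALIZE":
        return False  # ignore deletes, metadata updates and so on
    process_file(attributes["bucketId"], attributes["objectId"])
    return True


if __name__ == "__main__":
    sample = {
        "eventType": "OBJECT_FINALIZE",
        "bucketId": "my-uploads",      # hypothetical bucket name
        "objectId": "reports/q1.csv",  # hypothetical object name
    }
    handle_notification(sample, lambda bucket, name: print(bucket, name))
```

In a real subscriber you would pass in the attributes of each received message and acknowledge the message once `process_file` has finished.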
There are many more benefits of using Google Cloud Pub/Sub; I have listed only a few above. And of course, you get the reliability and scalability of Google, since Google uses the same infrastructure that it provides to its customers. If a problem arises, chances are that Google will find and fix it before it ever reaches you.
Conclusion
Google Cloud Pub/Sub is a really good solution for bulk data processing, distributed event notifications, asynchronous workflows and reliability improvement.
Cloud Pub/Sub can be integrated with any architecture with minimal effort. On top of that, it is pretty low priced.
I would suggest you take it for a spin, and do not forget to share your opinions in the comments below.