Facebook Thrift Tutorial

If you’ve never heard of Facebook Thrift before (or Apache Thrift, now that Facebook has released the source), then you’re missing out on an incredibly useful technology. Thrift is an RPC platform for developing cross-language services (it is also useful between programs written in the same language), and building on it is far more efficient and far less time-consuming than some more traditional approaches. Supported languages include Java, Python, PHP, and a pile of others. If you have heard of Thrift before, then you’ll know that the online documentation sucks, and it’s fairly difficult to get started.

I’ve had a fair bit of experience with Thrift, and I thought I’d share some of what I’ve learned.

Thrift works by taking a language-independent Thrift definition file and generating source code from it. This definition file may contain data structure definitions, enumerations, constant definitions, exceptions, and service interfaces. A Thrift ‘service’ is the most important part: here, you define the API you’d like to use as a developer, and the Thrift code generator outputs all the necessary serialization and transfer code. Let’s start with a simple example of a remote logging system in C++.

First of all, we’ll create a Thrift definition file called logger.thrift:

namespace cpp logger

struct LogMessage
{
  1:i64 timestamp,
  2:string message
}

struct LogMessageBatch
{
  1:list<LogMessage> msgs
}

exception LoggingException
{
  1:string msg
}

service Logger
{
  void log (1:LogMessage lm),
  void batch (1:LogMessageBatch lmb),

  LogMessage getLastMessage ()
}

Note that in the above, you can define structures which aggregate other structures. In this case, a LogMessageBatch contains a list of LogMessages. You can find out all the details about these files here. The next step is to generate the C++ code:

thrift --gen cpp logger.thrift

This will generate a new directory called “gen-cpp” which contains various files. There’s a lot of auto-generated code here, but we’ll first take a look at the file called Logger_server.skeleton.cpp:

// This autogenerated skeleton file illustrates how to build a server.
// You should copy it to another filename to avoid overwriting it.

#include "Logger.h"
#include <protocol/TBinaryProtocol.h>
#include <server/TSimpleServer.h>
#include <transport/TServerSocket.h>
#include <transport/TBufferTransports.h>

using namespace ::apache::thrift;
using namespace ::apache::thrift::protocol;
using namespace ::apache::thrift::transport;
using namespace ::apache::thrift::server;

using boost::shared_ptr;

using namespace logger;

class LoggerHandler : virtual public LoggerIf {
 public:
  LoggerHandler() {
    // Your initialization goes here
  }

  void log(const LogMessage &lm) {
    // Your implementation goes here
    printf("log\n");
  }

  void batch(const LogMessageBatch &lmb) {
    // Your implementation goes here
    printf("batch\n");
  }

  void getLastMessage(LogMessage &_return) {
    // Your implementation goes here
    printf("getLastMessage\n");
  }

};

int main(int argc, char **argv) {
  int port = 9090;
  shared_ptr<LoggerHandler> handler(new LoggerHandler());
  shared_ptr<TProcessor> processor(new LoggerProcessor(handler));
  shared_ptr<TServerTransport> serverTransport(new TServerSocket(port));
  shared_ptr<TTransportFactory> transportFactory(new TBufferedTransportFactory());
  shared_ptr<TProtocolFactory> protocolFactory(new TBinaryProtocolFactory());

  TSimpleServer server(processor, serverTransport, transportFactory, protocolFactory);
  server.serve();
  return 0;
}

If you’ve been paying attention, you’ll notice that the skeleton code matches the definitions given in the Thrift definition file. The logic you fill in inside the LoggerHandler defines how the server side of your program runs: decide what to do with each incoming log message, throwing exceptions as necessary. This part should be fairly obvious. Also note that incoming Thrift objects are always passed as const references (in C++, anyway), and a function which is supposed to return a Thrift object is generated as a void function that stores the result in a non-const reference parameter (_return in the skeleton).
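To make that calling convention concrete, here is a minimal, self-contained sketch of how the handler might be filled in. The LogMessage struct is a simplified stand-in for the generated class (the real one carries extra serialization code), the handler leaves out the LoggerIf base class so it compiles without Thrift, and keeping the last message in a member variable is just an assumption for illustration:

```cpp
#include <cassert>
#include <stdint.h>
#include <string>

// Simplified stand-in for the generated LogMessage (illustration only).
struct LogMessage {
  int64_t timestamp;
  std::string message;
  LogMessage() : timestamp(0) {}
};

// Mirrors the shape of the skeleton's LoggerHandler, minus the Thrift base class.
class LoggerHandlerSketch {
 public:
  // Incoming objects arrive as const references.
  void log(const LogMessage &lm) {
    last_ = lm;  // a real server might write to disk here instead
  }

  // "Returning" a struct means filling in the non-const reference.
  void getLastMessage(LogMessage &_return) {
    _return = last_;
  }

 private:
  LogMessage last_;  // hypothetical storage for the most recent message
};

// Usage: log one message, then read it back through the out-parameter.
void handlerExample() {
  LoggerHandlerSketch handler;

  LogMessage in;
  in.timestamp = 1234567890;
  in.message = "disk almost full";
  handler.log(in);

  LogMessage out;
  handler.getLastMessage(out);
  assert(out.timestamp == in.timestamp);
  assert(out.message == in.message);
}
```

In the real server the same bodies would go into the LoggerHandler class from the skeleton.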

If you look inside the main() function at the bottom of the code, you’ll see some boilerplate code to define some of the necessary structures to get a Thrift server running. There are various other classes available to you to run from here as well; for example, instead of a TSimpleServer, you may want to run a server with multiple threads, such as the TThreadPoolServer. Depending on your install path, the options available to you should be somewhere like /usr/local/include/thrift/. Also note that the thrift classes make use of boost shared_ptrs very often.

Now, let’s briefly implement the batch() function from above to give an idea of how to interact with these Thrift objects.

void batch(const LogMessageBatch &lmb) {

  std::vector<LogMessage>::const_iterator lmb_iter = lmb.msgs.begin ();
  std::vector<LogMessage>::const_iterator lmb_end = lmb.msgs.end ();

  while (lmb_iter != lmb_end)
  {
    log (*lmb_iter++); // Use the other thrift-defined interface function to write to disk, whatever.
  }
}

What to note here:

  • You access the fields of each Thrift object directly
  • Remember that all the incoming Thrift objects and their fields are const
  • LogMessageBatch contains a ‘list’ of LogMessages in the definition file, but after C++ generation it’s defined as a ‘vector’. This is just Thrift’s choice of container to represent a list in the generated code. I’m sure other languages have similar oddities.
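As a rough illustration of those points, the sketch below builds a batch the way a caller might and then walks it the way batch() does. The two structs are simplified stand-ins for the generated classes, and makeBatch and countNonEmpty are hypothetical helper names, not part of Thrift:

```cpp
#include <cassert>
#include <stdint.h>
#include <string>
#include <vector>

// Simplified stand-ins for the generated classes (illustration only).
struct LogMessage {
  int64_t timestamp;
  std::string message;
  LogMessage() : timestamp(0) {}
};

struct LogMessageBatch {
  std::vector<LogMessage> msgs;  // the Thrift 'list' shows up as a vector
};

// Hypothetical helper: fields are plain members, so they are set directly.
LogMessageBatch makeBatch(const std::vector<std::string> &lines, int64_t start) {
  LogMessageBatch lmb;
  for (std::size_t i = 0; i < lines.size(); ++i) {
    LogMessage lm;
    lm.timestamp = start + static_cast<int64_t>(i);
    lm.message = lines[i];
    lmb.msgs.push_back(lm);
  }
  return lmb;
}

// Hypothetical helper: the batch is const here, so a const_iterator is needed.
std::size_t countNonEmpty(const LogMessageBatch &lmb) {
  std::size_t n = 0;
  for (std::vector<LogMessage>::const_iterator it = lmb.msgs.begin();
       it != lmb.msgs.end(); ++it) {
    if (!it->message.empty()) {
      ++n;
    }
  }
  return n;
}
```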

The rest should be easy. Now, the big question is, how do we use this server? We need to use the generated client-side interface, defined in Logger.h and Logger.cpp. It’s called LoggerClient, and in order to use it, it needs to be instantiated with a Thrift protocol pointer. Here’s one way to do it:

#include "Logger.h"
#include <transport/TSocket.h>
#include <transport/TBufferTransports.h>
#include <protocol/TBinaryProtocol.h>

using namespace apache::thrift;

namespace logger
{
class LoggerClientInterface
{
private:
  // shared_ptr so the client is released along with the wrapper
  boost::shared_ptr<LoggerClient> client_;

public:
LoggerClientInterface ()
{
  // The port must match the one the server listens on (9090 in the skeleton above)
  boost::shared_ptr<transport::TSocket> socket = boost::shared_ptr<transport::TSocket>
      (new transport::TSocket ("127.0.0.1", 9090));

  boost::shared_ptr<transport::TBufferedTransport> transport
      = boost::shared_ptr<transport::TBufferedTransport>
        (new transport::TBufferedTransport (socket));

  boost::shared_ptr<protocol::TBinaryProtocol> protocol = boost::shared_ptr<protocol::TBinaryProtocol>
      (new protocol::TBinaryProtocol (transport));

  transport -> open ();
  client_ = boost::shared_ptr<LoggerClient> (new LoggerClient (protocol));
}

void log (const LogMessage &lm)
{
  client_ -> log (lm);
}
};
}

After initializing the LoggerClient object, you can either use it directly, or wrap its functionality in another class, like LoggerClientInterface::log() above. And you’re done. Remember to add /usr/local/include/thrift to your include path (-I/usr/local/include/thrift) and link with -lthrift, and you’re good to go.

There’s a fair bit of manual setup to get a Thrift service running, but it’s mostly boilerplate copy-and-paste which you can easily move from one service to the next. After that, it’s pretty easy once you get the hang of it. Good luck!


3 responses to “Facebook Thrift Tutorial”

  • Apache Thrift Tutorial – The Sequel « Cvet's Blog

    […] 13, 2010 by michael cvet The other day Ian and I were talking and thought it would be cool to do another Facebook/Apache Thrift tutorial, but this time he’d do the front-end client interface and […]

  • Ferreyra

    Hello Michael,

    I’ve read your post and started to use your samples. Good work.
    Meanwhile, I’m wondering if you can help me by clarifying whether Thrift can be used to implement on-demand file transfer? Something like an FTP server with a get method to retrieve a binary file.

    Thank you,

    NF

    • michael cvet

      NF,

      Glad to hear you found the examples useful.

      I would guess you could implement a binary file transfer by just dumping the binary file into a Thrift string object on the server side and sending it to the client. Something like:

      struct Document
      {
        1:string name,
        2:string content,
        3:bool is_binary // renamed: "binary" is a reserved type name in Thrift
      }
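      On the C++ side the generated string field is a std::string, which can hold arbitrary bytes (including NULs), so filling in the content could look something like the sketch below. readFileBytes is a hypothetical helper name, not part of Thrift. Other languages may treat a Thrift string as text, and Thrift also has a dedicated binary type for that case.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads a whole file into a std::string, opening it in
// binary mode so no newline translation is applied to the bytes.
std::string readFileBytes(const std::string &path) {
  std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
  if (!in) {
    throw std::runtime_error("cannot open " + path);
  }
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

      The server would then assign the result to the struct’s content field before returning it.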
