I tried our lw_shared_ptr, the compiler complained endless usage of
incomplete type stream_session. I can not include stream_session.hh
everywhere due to circular dependency.
For now, I'm using std::shared_ptr which works fine.
Each outgoing_file_message might contain multiple mutations. Send them
one mutation per RPC call (using frozen_mutation), instead of one big
outgoing_file_message per one RPC call.