Move all data fields into an 'impl' struct (pimpl idiom) so that moving
a packet only transfers a single pointer and becomes very cheap. The
downside is that we need an extra allocation, but we can recover that
later by placing the fragment array in the same structure.
Even with the extra allocation, performance is up ~10%.
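
For illustration only, a minimal sketch of the layout described above.
This is not the actual patch: the field names (len, frags), the fragment
type and the accessors are assumptions, and std::vector stands in for the
fragment array, so the sketch still pays a second allocation that the
planned follow-up would fold into the impl.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical fragment type: a non-owning view of one buffer.
struct fragment {
    char* base;
    std::size_t size;
};

class packet {
    // All data fields live behind a single pointer (pimpl idiom).
    struct impl {
        std::size_t len = 0;
        std::vector<fragment> frags;  // stand-in for the fragment array
    };
    std::unique_ptr<impl> _impl;
public:
    // This is the extra allocation mentioned above.
    packet() : _impl(std::make_unique<impl>()) {}

    // Moving transfers only the impl pointer; no data field is touched.
    packet(packet&&) noexcept = default;
    packet& operator=(packet&&) noexcept = default;

    void append(fragment f) {
        assert(_impl);  // appending to a moved-from packet is a bug
        _impl->len += f.size;
        _impl->frags.push_back(f);
    }

    // A moved-from packet has no impl and reports itself as empty.
    std::size_t len() const { return _impl ? _impl->len : 0; }
    std::size_t nr_frags() const { return _impl ? _impl->frags.size() : 0; }
    const fragment* frags_data() const {
        return _impl ? _impl->frags.data() : nullptr;
    }
};
```

With this layout, `packet q = std::move(p);` leaves the fragment storage
where it was: q takes over p's impl pointer and p is left empty.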