TLV - An XML/JSON Alternative

2/5/2011

Introduction

The problem I've been trying to solve is to find a data format that lets me send any kind of data over a network connection. Transmission needs to be efficient, as networks are often unreliable and slow. I also want to process data before sending it - specifically, compression and encryption. XML and JSON are both really popular and useful ways to represent data. Each has benefits, some of which are shared. Each also has shortcomings, which make it less than ideal as a format for transmission over a network.

The biggest issue I have with XML is that it's chatty. It bundles a ton of metadata with the data itself. Whilst I have other problems with XML (namespaces, the role of CDATA, DTDs), the chattiness alone rules it out.

JSON also has limitations. It only supports a small number of data types, so no dates or byte arrays (BSON supports byte arrays and dates, but not universal numbers). Most uses of JSON carry an implicit schema in the form of field names repeated in every object, which adds to the amount of data being transferred. Granted, not nearly as much as XML, but still more than is required when both ends of an exchange already know what they're exchanging.

This led me to TLV (Type, Length, Value) - a format that can encode virtually any data. As its name implies, TLV elements consist of three fields -

  • Type - A code which indicates the kind of field that an element represents.
  • Length - The size of the value field.
  • Value - A variable-sized series of data elements which contains data for a TLV element.

There's flexibility in TLV. The Type field could, for example, be used to identify a data type (string, integer, date, and so on). It could also be used to identify a field type, such as Customer Id or Customer Name. The choice is yours. This flexibility is what gives TLV another advantage over formats like XML and JSON - beyond the Type field, TLV does not include any field metadata, resulting in smaller data payloads.
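To make the layout concrete, here is a minimal sketch of how a single element might be laid out in a byte array. It assumes the one-byte Type and four-byte Length used by the sample code later in this article; the module and function names (TlvWireSketch, EncodeElement) are made up for illustration.

```vbnet
Option Explicit On
Option Strict On

Imports System

Public Module TlvWireSketch

    ' Builds one TLV element: [type: 1 byte][length: 4 bytes][value: length bytes].
    ' The 1 + 4 byte header is an assumption borrowed from the sample code in this article.
    Public Function EncodeElement(ByVal type As Byte, ByVal value() As Byte) As Byte()

        Dim _Length() As Byte = BitConverter.GetBytes(value.Length)   ' Machine byte order.
        Dim _Element(value.Length + 4) As Byte                        ' Upper bound, so 5 + value.Length bytes in total.

        _Element(0) = type
        _Length.CopyTo(_Element, 1)
        value.CopyTo(_Element, 5)

        Return _Element

    End Function

End Module
```

On a little-endian machine, BitConverter.ToString(EncodeElement(CByte(2), System.Text.Encoding.UTF8.GetBytes("Hi"))) gives 02-02-00-00-00-48-69: one type byte, four length bytes, then the two bytes of the value. The field name never appears in the payload.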

Multiple TLV elements can be assembled as a sequence, forming a record, or row of data. TLV supports nesting, so a sequence of TLV elements can itself be represented as a single TLV element. Some advantages of using a TLV representation:

  • TLV sequences are easily searched using generalized parsing functions.
  • Unknown TLV elements can be skipped, and the rest of the sequence parsed with no ill effects. This is similar to the way that unknown XML tags can be safely skipped.
  • TLV elements can be placed in any order inside the sequence.
  • TLV is ideal for binary data, which makes parsing faster and the data smaller.
  • TLV doesn't require libraries, and it's quick to implement. I was transmitting TLV in less than a day (see below).
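The first two points can be sketched as a single generalized search function. This is an illustration only, again assuming a one-byte Type followed by a four-byte Length; FindValue and TlvSearchSketch are hypothetical names, not part of the sample app.

```vbnet
Option Explicit On
Option Strict On

Imports System

Public Module TlvSearchSketch

    Private Const HEADER_SIZE As Integer = 5   ' 1 type byte + 4 length bytes (assumed layout).

    ' Returns the value of the first element with the wanted type, or Nothing.
    ' Elements of any other type are stepped over using their length field.
    Public Function FindValue(ByVal sequence() As Byte, ByVal wantedType As Byte) As Byte()

        Dim _Index As Integer = 0

        Do While _Index + HEADER_SIZE <= sequence.Length

            Dim _Type As Byte = sequence(_Index)
            Dim _Length As Integer = BitConverter.ToInt32(sequence, _Index + 1)

            ' Stop on a length that is negative or runs past the end of the buffer.
            If _Length < 0 OrElse _Length > sequence.Length - _Index - HEADER_SIZE Then Exit Do

            If _Type = wantedType Then
                Dim _Value(_Length - 1) As Byte
                Array.Copy(sequence, _Index + HEADER_SIZE, _Value, 0, _Length)
                Return _Value
            End If

            _Index += HEADER_SIZE + _Length   ' Skip this element entirely.

        Loop

        Return Nothing

    End Function

End Module
```

The function never needs to understand the elements it skips. The length field alone tells it where the next element starts, which is why unknown types do no harm.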

As you can imagine, TLV is no silver bullet either. It has baggage:

  • Data in TLV elements is represented using a single type, such as character or byte arrays. It must be converted from and to its native format when serialized or deserialized.
  • TLV implemented using byte arrays isn't human-readable, in the way that both XML and JSON are.
  • It isn't self-describing in the way that XML is. Both parties in an exchange of a TLV sequence must have prior knowledge of element types.
  • Nesting must similarly be understood by both parties.

Happily, none of these affects the way I use TLV. I've no need to see the contents of what's transmitted (encryption will kill that benefit off anyway). I use TLV to exchange data between clients and servers that form part of the same system, so I use the same code to serialize and deserialize TLV on both client and server.

Implementation

By way of example, we'll write an application that proves the concept simply by serializing a list of customers to both a TLV element and a byte array, and deserializing back to the customer list. The sample app is a Windows Forms application written in Visual Basic.NET, and consists of the following:

[Screenshot: TLV sample app in Solution Explorer]

Tlv.vb (TLV Element)

The following code listing shows a class describing a TLV element, and how it might encode and decode a byte array:

 
Option Explicit On
Option Strict On

Public NotInheritable Class Tlv
#Region "Private instance attributes"
 
    Private m_Type As Byte
    Private m_Value As Byte()

#End Region
#Region "Public instance attributes"
 
    Public Property Type() As Byte
        Get
            Return m_Type
        End Get
        Set(ByVal value As Byte)
            ' A Byte is always within 0-255, so no range check is needed.
            m_Type = value
        End Set
    End Property

    Public ReadOnly Property Length() As Integer
        Get
            If m_Value Is Nothing Then
                Return 0
            Else
                Return m_Value.Length
            End If
        End Get
    End Property

    Public Property Value() As Byte()
        Get
            Return m_Value
        End Get
        Set(ByVal value As Byte())
            m_Value = value
        End Set
    End Property

#End Region
#Region "Constructors"
 
    ' Default constructor.
    Public Sub New()
        MyBase.New()
    End Sub

    ' Instantiates a TLV element with the specified type and value.
    Public Sub New(ByVal type As Byte, ByVal value() As Byte)
        MyBase.New()
        m_Type = type
        m_Value = value
    End Sub

    ' Decodes the specified byte array into a TLV element.
    Public Sub New(ByVal value() As Byte)
        MyBase.New()
        FromByteArray(value)
    End Sub

#End Region
#Region "Private instance methods"
 
    ' Deserializes a Tlv element from a byte array.
    Private Sub FromByteArray(ByVal bytes() As Byte)

        Dim _Length As Integer = BitConverter.ToInt32(bytes, 1)

        m_Type = bytes(0)
        ReDim m_Value(_Length - 1)
        Array.Copy(bytes, 5, m_Value, 0, _Length)

    End Sub

#End Region
#Region "Public instance methods"
 
    ' Serializes the Tlv element to a byte array.
    Public Function ToByteArray() As Byte()

        Dim _ByteArray(m_Value.Length - 1 + 5) As Byte     ' Length - 1 + (one byte for type + four bytes for length).
        Dim _LengthAsBytes() As Byte = BitConverter.GetBytes(m_Value.Length)

        _ByteArray(0) = m_Type
        _LengthAsBytes.CopyTo(_ByteArray, 1)
        m_Value.CopyTo(_ByteArray, 5)

        Return _ByteArray

    End Function

#End Region
 
End Class
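One thing to keep in mind with the listing above: BitConverter reads and writes integers in the byte order of the machine it runs on. That's fine when both ends share an architecture, but if that can't be guaranteed, the length field could be pinned to a fixed byte order. A minimal sketch, assuming little-endian is chosen as the wire order; LengthToBytes and BytesToLength are hypothetical helpers, not part of the sample app.

```vbnet
Option Explicit On
Option Strict On

Imports System

Public Module TlvByteOrderSketch

    ' Writes a 32-bit length as four bytes, least significant byte first,
    ' regardless of the byte order of the machine.
    Public Function LengthToBytes(ByVal length As Integer) As Byte()
        Dim _Bytes() As Byte = BitConverter.GetBytes(length)
        If Not BitConverter.IsLittleEndian Then Array.Reverse(_Bytes)
        Return _Bytes
    End Function

    ' Reads a length written by LengthToBytes, starting at the given offset.
    Public Function BytesToLength(ByVal bytes() As Byte, ByVal offset As Integer) As Integer
        Dim _Bytes(3) As Byte
        Array.Copy(bytes, offset, _Bytes, 0, 4)
        If Not BitConverter.IsLittleEndian Then Array.Reverse(_Bytes)
        Return BitConverter.ToInt32(_Bytes, 0)
    End Function

End Module
```

With these helpers a length of 258 is always written as 02-01-00-00. They could stand in for the BitConverter.GetBytes and BitConverter.ToInt32 calls in ToByteArray and FromByteArray.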

TlvCollection.vb (TLV Sequence)

Listing 2 contains a class describing a TLV sequence:

 
Option Explicit On
Option Strict On

Imports System.Collections.ObjectModel

Public NotInheritable Class TlvCollection
    Inherits Collection(Of Tlv)
#Region "Constructors"
 
    ' Default constructor.
    Public Sub New()
        MyBase.New()
    End Sub

    ' Decodes the specified byte array into a TLV sequence.
    Public Sub New(ByVal value() As Byte)
        MyBase.New()
        FromByteArray(value)
    End Sub

#End Region
#Region "Private instance methods"
 
    ' Extracts a TLV element from a sequence.
    Private Function ExtractTlv(ByVal byteArray() As Byte, ByVal index As Integer) As Tlv

        Dim _Type As Byte = byteArray(index)
        Dim _Length As Integer = BitConverter.ToInt32(byteArray, index + 1)
        Dim _Value(_Length - 1) As Byte

        Array.Copy(byteArray, index + 5, _Value, 0, _Length)

        Return New Tlv(_Type, _Value)

    End Function

    ' Deserializes a TLV sequence from a byte array.
    Private Sub FromByteArray(ByVal byteArray() As Byte)

        Dim _BytesRead As Integer = 0

        Do While _BytesRead < byteArray.Length
            Dim _Tlv As Tlv = ExtractTlv(byteArray, _BytesRead)
            _BytesRead += _Tlv.Length + 5
            Me.Add(_Tlv)
        Loop

    End Sub

#End Region
#Region "Public instance methods"
 
    ' Serializes the TLV sequence to a byte array.
    Public Function ToByteArray() As Byte()

        Dim _ByteArray() As Byte = {}

        For Each _Tlv As Tlv In Me
            Dim _TlvBytes() As Byte = _Tlv.ToByteArray()
            Dim _ByteIndex As Integer = _ByteArray.Length
            ReDim Preserve _ByteArray(_ByteArray.Length - 1 + _TlvBytes.Length)
            _TlvBytes.CopyTo(_ByteArray, _ByteIndex)
        Next

        Return _ByteArray

    End Function

#End Region
 
End Class
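ToByteArray above grows its buffer with ReDim Preserve, which copies everything written so far each time an element is added. That's harmless for a handful of elements. For long sequences, a MemoryStream is one possible alternative. The sketch below is an illustration only; Concatenate and TlvSequenceWriterSketch are hypothetical names.

```vbnet
Option Explicit On
Option Strict On

Imports System
Imports System.Collections.Generic
Imports System.IO

Public Module TlvSequenceWriterSketch

    ' Joins already-serialized TLV elements into one buffer.
    ' The stream manages its own growth, so nothing is re-copied by hand per element.
    Public Function Concatenate(ByVal elements As IEnumerable(Of Byte())) As Byte()

        Using _Stream As New MemoryStream()
            For Each _Element As Byte() In elements
                _Stream.Write(_Element, 0, _Element.Length)
            Next
            Return _Stream.ToArray()
        End Using

    End Function

End Module
```

Inside TlvCollection, the same loop could write each _Tlv.ToByteArray() result to the stream instead of resizing _ByteArray.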

Util.vb (Application-Level constants)

Before we start with customers, we need to define some application-level constants that identify customer entities and customer collections. These constants will be used as values for the Type field in TLV elements:

    Option Explicit On
    Option Strict On

    Public Module Util

        ' TLV Type constants.
        Friend Const ENTITY_CUSTOMER As Byte = 1
        Friend Const COLLECTION_CUSTOMERS As Byte = 2

    End Module

Customer.vb (Customer class)

The Customer class is an entity class that describes a customer. I won't provide the full listing as it's just a bunch of properties and a default constructor:

[Image: Customer class]

The interesting part is serializing and deserializing a Customer to and from TLV. First we add constants to identify customer fields in the TLV Type attribute:

    ' TLV type constants.
    Private Const CUSTOMER_ID As Byte = 1
    Private Const CUSTOMER_NAME As Byte = 2
    Private Const CUSTOMER_ADDRESS As Byte = 3
    Private Const CUSTOMER_JOIN_DATE As Byte = 4

Serializing a Customer object to TLV now simply requires a TLV element per field, and adding each element to a TLV sequence. Finally, the entire sequence is returned as a single TLV element:

    ' Serialize to a TLV element.
    Public Function ToTlv() As Tlv
    
        Dim _Tlv As Tlv = Nothing
        Dim _Tlvs As TlvCollection = New TlvCollection

        If Not m_Id.Equals(Guid.Empty) Then _Tlvs.Add(New Tlv(CUSTOMER_ID, m_Id.ToByteArray))
        If Not String.IsNullOrWhiteSpace(m_Name) Then _Tlvs.Add(New Tlv(CUSTOMER_NAME, Encoding.UTF8.GetBytes(m_Name)))
        If Not String.IsNullOrWhiteSpace(m_Address) Then _Tlvs.Add(New Tlv(CUSTOMER_ADDRESS, Encoding.UTF8.GetBytes(m_Address)))
        If Date.Compare(m_JoinDate, Date.MinValue) > 0 And _
           Date.Compare(m_JoinDate, Date.MaxValue) < 0 Then _
                _Tlvs.Add(New Tlv(CUSTOMER_JOIN_DATE, BitConverter.GetBytes(m_JoinDate.ToUniversalTime.Ticks)))

        _Tlv = New Tlv(ENTITY_CUSTOMER, _Tlvs.ToByteArray())

        Return _Tlv

    End Function

Deserializing a Customer object back from TLV is accomplished by instantiating the TLV sequence containing the customer fields. We then cycle through the sequence, and assign values to customer properties using a Select Case statement:

    ' Deserialize from a TLV element.
    Public Sub FromTlv(ByVal value As Tlv)

        If value.Type = ENTITY_CUSTOMER Then

            Dim _Tlvs As TlvCollection = New TlvCollection(value.Value)

            For Each _Tlv As Tlv In _Tlvs
                Select Case _Tlv.Type
                    Case CUSTOMER_ID
                        m_Id = New Guid(_Tlv.Value)
                    Case CUSTOMER_NAME
                        m_Name = Encoding.UTF8.GetString(_Tlv.Value, 0, _Tlv.Length)
                    Case CUSTOMER_ADDRESS
                        m_Address = Encoding.UTF8.GetString(_Tlv.Value, 0, _Tlv.Length)
                    Case CUSTOMER_JOIN_DATE
                        m_JoinDate = New Date(BitConverter.ToInt64(_Tlv.Value, 0)).ToLocalTime
                End Select
            Next

        End If

    End Sub
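The join date is the only field that needs real conversion: it travels as the eight bytes of its UTC tick count. In the listing above, New Date(ticks) produces a date whose Kind is Unspecified, and ToLocalTime treats such a value as UTC, so the round trip works. A slightly more explicit variant might mark the decoded value as UTC and leave the conversion to local time to the caller. A minimal sketch with hypothetical helper names:

```vbnet
Option Explicit On
Option Strict On

Imports System

Public Module TlvDateSketch

    ' Encodes a date as the eight bytes of its UTC tick count.
    Public Function DateToBytes(ByVal value As Date) As Byte()
        Return BitConverter.GetBytes(value.ToUniversalTime().Ticks)
    End Function

    ' Decodes the eight bytes back into a date explicitly marked as UTC.
    Public Function BytesToUtcDate(ByVal bytes() As Byte) As Date
        Return New Date(BitConverter.ToInt64(bytes, 0), DateTimeKind.Utc)
    End Function

End Module
```

Calling BytesToUtcDate(DateToBytes(someDate)).ToLocalTime() then gives the same result as the Select Case branch for CUSTOMER_JOIN_DATE, with the time zone handling spelled out.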

As a final touch we'll add a second constructor to Customer.vb that accepts a TLV element as a parameter:

    Public Sub New(ByVal value As Tlv)
        MyBase.New()
        If value IsNot Nothing AndAlso value.Length > 0 Then
            If value.Type = ENTITY_CUSTOMER Then
                FromTlv(value)
            End If
        End If
    End Sub

CustomerCollection.vb (Customer list)

The CustomerCollection class contains a list of customers. It inherits from Collection(Of T) where T is Customer:

[Image: CustomerCollection class]

Serializing customers is as easy as cycling through the list and adding customer TLV elements to a TLV sequence. The sequence is returned as a TLV element with type COLLECTION_CUSTOMERS:

    ' Serialize to a TLV sequence.
    Public Function ToTlv() As Tlv

        Dim _Tlv As Tlv = Nothing
        Dim _Tlvs As TlvCollection = Nothing

        If Me.Count > 0 Then
            _Tlvs = New TlvCollection
            For Each _Customer As Customer In Me.Items
                ' ToTlv already returns a Tlv, so there is no need to serialize and re-parse it.
                _Tlvs.Add(_Customer.ToTlv())
            Next
            _Tlv = New Tlv(COLLECTION_CUSTOMERS, _Tlvs.ToByteArray)
        End If

        Return _Tlv

    End Function

Deserializing customers is the inverse - cycle through the sequence and deserialize TLV elements into customers:

    ' Deserialize from a TLV sequence.
    Public Sub FromTlv(ByVal value As Tlv)

        If value.Type = COLLECTION_CUSTOMERS Then

            Dim _Tlvs As TlvCollection = New TlvCollection(value.Value)

            For Each _Tlv As Tlv In _Tlvs
                If _Tlv.Type = ENTITY_CUSTOMER Then
                    Me.Add(New Customer(_Tlv))
                End If
            Next

        End If

    End Sub


Wittenburg.co.uk and all content copyright 1995-2018 by Michael Wittenburg, unless otherwise stated.
All content on this site is licensed under the Creative Commons license, unless otherwise stated.
