Semistructured data is data that has some structure, but the structure may not be rigid, regular or complete, and it may change rapidly or unpredictably. The data does not conform to a fixed schema, and the information normally associated with a schema is contained within the data itself. In some forms of semistructured data there is no separate schema; in others there is one, but it places only loose constraints on the data. Database management systems (DBMSs) built around a fixed schema do not handle data of this nature particularly well.

Structured Models

Information retrieval based on keywords often assumes flat documents. However, it is possible to make use of the structure of a document, for example by giving words in the title a higher weight than words elsewhere in the document. An example of a query that uses knowledge of the structure is finding every document that contains a page where the string "transaction" occurs in the text surrounding a figure whose label includes the word "mobile".
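The idea of weighting title words more heavily can be illustrated with a minimal Python sketch. The weights, the dictionary layout of a document and the function names are assumptions made for illustration only; they are not taken from any particular retrieval system.

```python
import re

# Hypothetical weights: a hit in the title counts three times as much
# as a hit in the body. Real systems tune such values empirically.
TITLE_WEIGHT = 3.0
BODY_WEIGHT = 1.0


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(document, keyword):
    """Score a document for one keyword, using its title/body structure."""
    keyword = keyword.lower()
    title_hits = tokenize(document["title"]).count(keyword)
    body_hits = tokenize(document["body"]).count(keyword)
    return TITLE_WEIGHT * title_hits + BODY_WEIGHT * body_hits


# Two made-up documents, each with a title and a body.
documents = [
    {"title": "Commit protocols",
     "body": "A transaction may span several mobile hosts."},
    {"title": "Mobile transaction models",
     "body": "A survey of commit protocols."},
]

# Rank the documents by descending score for the keyword "transaction".
ranked = sorted(documents, key=lambda d: score(d, "transaction"), reverse=True)
for doc in ranked:
    print(score(doc, "transaction"), doc["title"])
```

Both documents mention "transaction" exactly once, so a flat keyword count would not distinguish them. With the structure taken into account, the document that has the word in its title ranks first.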

Self-Describing Data

Self-describing data is data in which the attribute names are embedded in the data itself, so no separate schema is needed to work out what each value means. Markup languages can describe formatting, structure, semantics, attributes and so on using tags: each element is delimited by a start-tag and an end-tag.
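As a small illustration, the following Python sketch reads an XML fragment using only the standard library. The record and its element names (person, name, email, phone) are invented for this example. The program has no prior knowledge of the fields; it discovers the attribute names from the tags in the data.

```python
import xml.etree.ElementTree as ET

# A made-up record. Every value is wrapped in a start-tag and an end-tag
# that names it, so the field names travel with the data.
record = """
<person>
  <name>Ada Lovelace</name>
  <email>ada@example.org</email>
  <phone type="home">555-0100</phone>
</person>
"""

root = ET.fromstring(record.strip())

# Build a field-name -> value mapping purely from the tags in the data.
fields = {child.tag: child.text for child in root}

for tag, value in fields.items():
    print(tag, "=", value)

# Attributes inside a start-tag are also available without a schema.
print(root.find("phone").attrib)
```

Adding a new element to the record, such as a second address, would require no change to the reading code, which is the sense in which the data describes itself.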