Intelligent Archive: Budgerigar Version

7 Jul 2011 - 00:00


The Intelligent Archive (IA) is a Java-based program used for text analysis within the University of Newcastle’s Centre for Literary and Linguistic Computing (CLLC). Researchers at the Centre use it in different ways, depending on which aspect of text analysis they are working on. A typical CLLC project involves preparing a set of texts for computational stylistics, with the ultimate purpose of determining the authorship of a disputed literary work or analysing the style of a work or group of works.

The IA serves these projects by organising sets of texts and producing word counts that can be exported for analysis in an external spreadsheet or statistics program. It is an interface to an archive of texts and incorporates a range of counting functions that the user can configure, hence the name 'intelligent archive'. While most text-processing programs focus on more linguistic outputs, such as concordances or lists of the commonest collocates of a given word, the IA's primary function is more statistical, centred on producing frequency counts of words.
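To make the idea of exportable word counts concrete, here is a minimal Python sketch of a texts-by-words frequency table written out as CSV, the kind of file a spreadsheet or statistics program can open. It is not the IA's own code or export format: the function names (`tokenize`, `frequency_table`, `to_csv`), the sample texts, and the simple definition of a "word" are all assumptions made for illustration.

```python
import csv
import io
import re
from collections import Counter


def tokenize(text):
    # Assumption: a word is a run of letters or apostrophes, lower-cased.
    return re.findall(r"[a-z']+", text.lower())


def frequency_table(texts, top_n=3):
    """Build a table with one row per text and one column per word.

    The columns are the top_n most frequent words across all texts
    (ties broken alphabetically so the result is deterministic).
    """
    counts = {name: Counter(tokenize(body)) for name, body in texts.items()}
    total = Counter()
    for per_text in counts.values():
        total.update(per_text)
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    words = [word for word, _ in ranked[:top_n]]
    header = ["text"] + words
    rows = [[name] + [counts[name][w] for w in words] for name in texts]
    return header, rows


def to_csv(header, rows):
    # Write to an in-memory buffer; a real export would write to a file.
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample texts standing in for a text set.
texts = {
    "sample_a": "The cat and the dog and the bird.",
    "sample_b": "A dog and a cat.",
}
header, rows = frequency_table(texts)
print(to_csv(header, rows))
```

Each row of the resulting table is a text and each column a word, which is a common starting layout for statistical comparison of texts.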


The Intelligent Archive Budgerigar software currently provides the following core facilities:


  • Management of individual texts of different formats within a virtual library or repository
  • Management of text sets, which are user-created groups of these texts
  • Word frequency analysis on individual texts, tagged sections within texts, text sets, contiguous block segments of a specified size within texts, etc.
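
As a rough illustration of the last facility, counting words within contiguous blocks of a specified size, the following Python sketch splits a text into fixed-size blocks of words and counts chosen words in each block. It is not taken from the IA itself: the function names, the tokenising rule, and the decision to drop a trailing partial block are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: a word is a run of letters or apostrophes, lower-cased.
    return re.findall(r"[a-z']+", text.lower())


def block_counts(text, block_size, targets):
    """Count each target word in successive blocks of block_size words.

    Assumption: a trailing block shorter than block_size is dropped,
    so every block counted has the same length.
    """
    words = tokenize(text)
    last_start = len(words) - block_size
    blocks = [
        words[i:i + block_size]
        for i in range(0, last_start + 1, block_size)
    ]
    result = []
    for block in blocks:
        counts = Counter(block)
        result.append([counts[t] for t in targets])
    return result


# Hypothetical sample text of twelve words, split into blocks of four.
sample = "The cat saw the dog and the dog saw the cat again"
for row in block_counts(sample, 4, ["the", "dog"]):
    print(row)
```

Keeping every block the same length means raw counts can be compared directly between blocks; a different policy for the leftover words (for example, merging them into the final block) would be equally easy to implement.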

The IA is available free of charge from the website cited above.