This paper discusses state-of-the-art automatic analysis tools for personal audio content management. A Bayesian-network-based audio classification algorithm classifies content into four main audio classes and serves as the first step for the subsequent analysis tools. For speech analysis, we propose an improved BIC-based speaker segmentation and clustering algorithm and a combined gender and emotion detection algorithm that uses prosodic features. For the other main classes, it is often hard to devise any general, well-functioning pre-categorization that would fit the unforeseeable types of user-recorded data. To compensate for the absence of analysis tools for these classes, we propose an efficient audio similarity measure and a query-by-example algorithm with database clustering capabilities. The experiments indicate that the audio similarity framework can also produce relationship metadata, for example by relating the labeled speaker segments of one sample across the user's whole personal database. By following some simple principles for combined implementation, the framework can also be supported on personal mobile devices. The experimental results show that the combined use of the algorithms is feasible in practice.
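
As an illustration of the kind of test that BIC-based speaker segmentation builds on, the following minimal sketch scores a candidate change point with the standard delta-BIC criterion. It is not the improved algorithm proposed in this paper, whose details are not given here. It assumes diagonal-covariance Gaussian models to keep the example dependency-free, and all function names, the margin, the penalty weight, and the synthetic two-dimensional "features" are hypothetical choices made for this example.

```python
import math
import random


def log_det_diag(frames):
    """Log-determinant of a diagonal covariance estimated from the frames."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for j in range(dim):
        mean = sum(f[j] for f in frames) / n
        var = sum((f[j] - mean) ** 2 for f in frames) / n
        total += math.log(max(var, 1e-10))  # floor avoids log(0)
    return total


def delta_bic(frames, split, penalty_weight=1.0):
    """Delta-BIC for modelling frames as two Gaussians split at `split`
    versus one Gaussian. Positive values favour a change point."""
    n = len(frames)
    dim = len(frames[0])
    left, right = frames[:split], frames[split:]
    gain = 0.5 * (n * log_det_diag(frames)
                  - len(left) * log_det_diag(left)
                  - len(right) * log_det_diag(right))
    # Extra parameters of the two-model hypothesis under the diagonal
    # assumption: dim means + dim variances.
    penalty = 0.5 * penalty_weight * (2 * dim) * math.log(n)
    return gain - penalty


def best_change_point(frames, margin=10, penalty_weight=1.0):
    """Return (split, score) maximising delta-BIC, keeping at least
    `margin` frames on each side so the variance estimates stay stable."""
    best_split, best_score = None, float("-inf")
    for split in range(margin, len(frames) - margin + 1):
        score = delta_bic(frames, split, penalty_weight)
        if score > best_score:
            best_split, best_score = split, score
    return best_split, best_score


def make_demo_frames(seed=0):
    """Synthetic stand-in for acoustic features: two 'speakers' with
    different means, 200 frames each (purely illustrative data)."""
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
    b = [[rng.gauss(5.0, 1.0), rng.gauss(-5.0, 1.0)] for _ in range(200)]
    return a + b


if __name__ == "__main__":
    split, score = best_change_point(make_demo_frames())
    print("candidate change point:", split, "delta-BIC:", score)
```

A positive score at the best split suggests a speaker change there, while a non-positive maximum suggests the segment is homogeneous. In a real system the frames would be acoustic feature vectors such as cepstral coefficients, and full covariance matrices are commonly used in place of the diagonal simplification assumed here.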