OK, I can certainly give you some ideas, but since formal languages and the whole theory behind them are a big topic in computer science, it will be hard to explain everything here in the forum.
First, you should think about the language you want to parse and define a syntax and grammar that specify this language. This requires some knowledge and planning, but the naive approach without an exact definition of the language doesn't scale well: for any non-trivial language it will end up in a mess, in particular if you later want to extend or change your language. Another consideration is the "Chomsky type" of a language, which has an important impact on how difficult it is to write a program that parses such language data. In terms of this difficulty, XML languages are not the easiest starting point because they belong to the context-free languages; regular languages, which can be defined with regular expressions, are the easier ones.
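To give you a feeling for what such a definition looks like: a heavily simplified, EBNF-style grammar for an XML-like element structure might read as follows (just a sketch for illustration, not the real XML specification):

    document  ::= element
    element   ::= '<' NAME attribute* '>' content '</' NAME '>'
                | '<' NAME attribute* '/>'
    attribute ::= NAME '=' STRING
    content   ::= (element | TEXT)*

Here NAME, STRING and TEXT are token classes produced by the scanner described below.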
To avoid problems in general you should separate lexical analysis from syntactic analysis (you can easily find all these buzzwords with Google). The lexical analysis is done by a so-called "lexer" or "scanner"; the syntactic analysis is the work of the actual parser.
A scanner is a kind of pre-processor for the parser and can be implemented as a finite state machine. It takes an input stream and simply splits it up into "tokens". In XML, for example, tokens are special characters like <, > or ", identifiers for element and attribute names, and so on. In particular, the scanner doesn't care whether the input makes sense as long as it consists of allowed tokens; i.e. invalid and non-well-formed XML is still valid input for the scanner.
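To make this concrete, here is a minimal sketch in Java of such a state-machine scanner for an XML-like input. All names (XmlScanner, TokenType etc.) are made up for this example, it handles only a fraction of real XML (no comments, entities, CDATA etc.), and the error handling is rudimentary:

    import java.util.ArrayList;
    import java.util.List;

    enum TokenType { LT, GT, SLASH, EQUALS, NAME, STRING, TEXT }

    class Token {
        final TokenType type;
        final String value;
        Token(TokenType type, String value) { this.type = type; this.value = value; }
    }

    class XmlScanner {
        private final String input;
        private int pos = 0;
        private boolean insideTag = false; // the scanner's "state"

        XmlScanner(String input) { this.input = input; }

        List<Token> scan() {
            List<Token> tokens = new ArrayList<>();
            while (pos < input.length()) {
                char c = input.charAt(pos);
                if (c == '<')      { insideTag = true;  pos++; tokens.add(new Token(TokenType.LT, "<")); }
                else if (c == '>') { insideTag = false; pos++; tokens.add(new Token(TokenType.GT, ">")); }
                else if (insideTag && c == '/') { pos++; tokens.add(new Token(TokenType.SLASH, "/")); }
                else if (insideTag && c == '=') { pos++; tokens.add(new Token(TokenType.EQUALS, "=")); }
                else if (insideTag && c == '"') { tokens.add(new Token(TokenType.STRING, readString())); }
                else if (insideTag && Character.isWhitespace(c)) { pos++; } // skip whitespace inside tags
                else if (insideTag) {
                    String name = readName();
                    if (name.isEmpty()) // an illegal character is the only error the scanner reports
                        throw new IllegalArgumentException("Illegal character '" + c + "' at " + pos);
                    tokens.add(new Token(TokenType.NAME, name));
                }
                else { tokens.add(new Token(TokenType.TEXT, readText())); } // character data between tags
            }
            return tokens;
        }

        private String readString() {
            int start = ++pos;                    // skip the opening quote
            while (pos < input.length() && input.charAt(pos) != '"') pos++;
            return input.substring(start, pos++); // skip the closing quote
        }

        private String readName() {
            int start = pos;
            while (pos < input.length()
                    && (Character.isLetterOrDigit(input.charAt(pos)) || input.charAt(pos) == ':')) pos++;
            return input.substring(start, pos);
        }

        private String readText() {
            int start = pos;
            while (pos < input.length() && input.charAt(pos) != '<') pos++;
            return input.substring(start, pos);
        }
    }

Note that the scanner happily tokenizes something like "<a></b>" without complaint; whether the tags match is not its job.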
The parser takes the stream of (allowed) tokens from the scanner as its input, so in this step of processing you no longer have to worry about lexical errors (illegal characters, typos and so on); that is why I said you should split up these two steps. I think the easiest way to write a parser by hand is a recursive descent parser. The job of the parser is to make sense out of the tokens. For the XML example this means that the parser recognizes whether the elements are well-formed, i.e. balanced, and reports errors as necessary. During the parsing process you will usually want to build some kind of tree data structure in memory, called an "abstract syntax tree", or "AST" for short.
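Continuing the sketch from above: a recursive descent parser typically has one method per grammar rule, and recursion in the grammar (elements containing elements) maps directly onto recursive method calls. Again, everything here is illustrative; attributes and self-closing tags are omitted to keep it short:

    import java.util.ArrayList;
    import java.util.List;

    class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        Node(String name) { this.name = name; }
    }

    class XmlParser {
        private final List<Token> tokens;
        private int pos = 0;

        XmlParser(List<Token> tokens) { this.tokens = tokens; }

        // element ::= '<' NAME '>' (element | TEXT)* '</' NAME '>'
        Node parseElement() {
            expect(TokenType.LT);
            String name = expect(TokenType.NAME).value;
            expect(TokenType.GT);
            Node node = new Node(name);
            while (!atClosingTag()) {
                if (peek(0).type == TokenType.TEXT) node.text.append(next().value);
                else node.children.add(parseElement()); // the "recursive descent"
            }
            expect(TokenType.LT);
            expect(TokenType.SLASH);
            String closing = expect(TokenType.NAME).value;
            if (!closing.equals(name)) // this is the well-formedness check
                throw new IllegalStateException("<" + name + "> closed by </" + closing + ">");
            expect(TokenType.GT);
            return node;
        }

        private boolean atClosingTag() {
            return peek(0).type == TokenType.LT && peek(1).type == TokenType.SLASH;
        }
        private Token peek(int ahead) { return tokens.get(pos + ahead); }
        private Token next() { return tokens.get(pos++); }
        private Token expect(TokenType type) {
            Token t = next();
            if (t.type != type)
                throw new IllegalStateException("Expected " + type + " but found " + t.type);
            return t;
        }
    }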
By traversing this AST (like any other tree data structure) you can now decide what you actually want "to do" with your parsed language content. This could mean anything from creating a graphical visualization to compiling Java source files into bytecode or directly interpreting the AST of a scripting language source file.
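As a trivial example of such a traversal, with the hypothetical Node class from the sketch above you could print the element structure as an indented outline:

    import java.util.List;

    class TreePrinter {
        // Recursively walk the AST and print one indented line per element.
        static void printTree(Node node, int depth) {
            for (int i = 0; i < depth; i++) System.out.print("  ");
            System.out.println(node.name + (node.text.length() > 0 ? ": \"" + node.text + "\"" : ""));
            for (Node child : node.children) printTree(child, depth + 1);
        }

        public static void main(String[] args) {
            List<Token> tokens = new XmlScanner("<book><title>FAQ</title></book>").scan();
            Node root = new XmlParser(tokens).parseElement();
            printTree(root, 0); // prints "book" and, indented below it, "title: \"FAQ\""
        }
    }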
I know this is not the answer you hoped to hear, but without knowing how much background knowledge you already have, I couldn't come up with a better one. For any non-trivial example the problem is complex, and so will be the solution. Sorry :-) Some may argue that the "naive" approach with simple pattern matching in strings etc. will work, too, but I suspect this breaks down as soon as the content/language to parse gets even slightly more complex and requires systematic parsing. To make a long story short: in my opinion, if you don't have the required knowledge already, you should read up on all the things in this post which you don't know. Otherwise I seriously doubt that you will be able to create an XML parser by hand which works correctly. Of course feel free to ask any question, but the topic is really, really too big to explain every detail here. The lengthy text above only scratches the surface.
Marco