Sunday, February 15, 2009

Musings on Natural Language Understanding

Quality of editing: poor. Apologies in advance.
When I first looked at the original message from Antti J. Ylikoski, I assumed (perhaps unrealistically) that natural language understanding was meant, since he said "I do not constrain...".
My messages should be viewed from this perspective. If this is not your interest, you may skip the rest.
Peter Drucker's dictum, or its equivalents "Look before you leap" and "Take a top-down view before getting bogged down in the nitty-gritty", is useful given everyone's time constraints.

Let me start by saying that, like others, I have gone through generations of natural language processing tools, both conceptually and by working through others' code in addition to creating my own, each time saying "This is it. Eureka!", until I was dissuaded by a host of materials, the starting point being John F. Sowa's "The Challenge of Knowledge Soup" (see http://www.jfsowa.com/pubs/). So today, like most others who have spent time on this, I recognise that the problems here are currently unsolved and likely to remain so for a long, long time. Many actually believe that an artificial mind is what is needed.

That said, it is fascinating and a worthwhile mind sport, and in view of its usefulness I would be surprised if the big players (IBM, Microsoft, etc.) are not at it.

The least we can do is to understand the limitations of our current systems and how we might overcome them. To understand the limitations is to uncover the assumptions, often unstated, that underlie the current systems; in a word, to make the implicit explicit. This is more in the nature of a philosopher's task, and that is the reason why I said "back to a philosophy class".
All the messages I happened to look at seemed to take for granted what I would call a language philosophy model = bag of words, and a reasoning philosophy model = predicate logic.
A little elaboration: nobody wants to use NLP to do only POS tagging and parsing. One wants to use it in practical applications, and to do that the most common approach is predicate logic or its variants. So we really do have some assumptions about language. This set of assumptions can be summed up by the term language philosophy model: we view language as a finite set of words and word collocations; words are related to other words; words can be categorised; allowable word sequences are governed by a grammar.
An alternate view of language (language philosophy model = concepts) could be that it is a set of concepts, not words. One difference is that there is no polysemy here. "Bank" as in financial institution is a different concept from "bank" as in the bank of the Ganges (a river) or a bank of radars. In WordNet you would have different senses of one word, but in a concept net they would be different concepts altogether, as the sketch below illustrates.
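To make the contrast concrete, here is a minimal sketch using NLTK's WordNet interface (assuming NLTK and its WordNet data are installed); the concept-net side is a toy dictionary of my own, purely for illustration.

    # Word senses vs. concepts: one surface word, several distinct nodes.
    # Assumes: pip install nltk; nltk.download('wordnet')
    from nltk.corpus import wordnet as wn

    # Bag-of-words view: one word "bank" carrying many senses.
    for synset in wn.synsets('bank')[:3]:
        print(synset.name(), '->', synset.definition())

    # Concept view (hypothetical): distinct nodes that merely happen to
    # share the surface form "bank"; no polysemy at the concept level.
    concepts = {
        'FinancialInstitution': {'surface_forms': ['bank'], 'is_a': 'Organisation'},
        'RiverBank':            {'surface_forms': ['bank'], 'is_a': 'Landform'},
        'RadarBank':            {'surface_forms': ['bank'], 'is_a': 'Arrangement'},
    }
    for name, node in concepts.items():
        print(name, 'is_a', node['is_a'])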
Similar remarks hold for the reasoning philosophy model. Is there an alternative to reasoning with predicates? Well, yes. Take a look at http://web.media.mit.edu/~hugo/conceptnet/

He doesn't use predicate logic. I am not endorsing that work; all I am saying is, don't shut your eyes to alternate models. Now try to figure out the relevance of NP-hardness in relation to ConceptNet: no predicate logic, just simple node search. So before you categorise a problem, know what the problem is and the terms in which you have described it.
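To see what "simple node search" means, here is a minimal sketch of breadth-first search over a toy concept graph. The nodes, relations and function names are my own invention, not ConceptNet's actual data or API; the point is only that a reachability query is polynomial-time graph traversal, not theorem proving.

    # Reasoning as node search over a toy concept graph (invented data).
    from collections import deque

    graph = {
        'cake':   [('UsedFor', 'eating'), ('AtLocation', 'bakery')],
        'eating': [('Causes', 'satisfaction')],
        'bakery': [('AtLocation', 'town')],
    }

    def connect(start, goal):
        """Breadth-first search: return a relation path from start to goal."""
        seen, queue = {start}, deque([(start, [])])
        while queue:
            node, path = queue.popleft()
            if node == goal:
                return path
            for relation, neighbour in graph.get(node, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, path + [(node, relation, neighbour)]))
        return None

    print(connect('cake', 'satisfaction'))
    # [('cake', 'UsedFor', 'eating'), ('eating', 'Causes', 'satisfaction')]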


Actually there exists a range of models (models in the sense of engineering/physical-science models, not Tarski-type models), depending on the philosophical beliefs embodied in:

  • language philosophy model
  • reasoning philosophy model
Once you are through with this phase and have taken a position, you will want to use the models to capture laws/rules or regularities of language and logic. The exact terms of reference in the laws will depend on the two models above. Everybody knows what these are for the bag-of-words model: you look for regularities in terms of a noun phrase preceding a verb phrase, and so on, and your terms of reference will be words, word order, phrases, phrase order, etc.

It is at this point that you might wonder whether to use a rule-based search of the corpus or a statistical approach to capture the regularities, or quite simply to generate the rules using volunteers to find the regularities. Both routes can be sketched in a few lines, as below.
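As a deliberately tiny illustration of the two routes, the sketch below contrasts a hand-written rule with counting over NLTK's tagged Brown corpus sample; the "rule" is one I invented, and real systems use far richer features.

    # Capturing a regularity two ways: assert a rule, or count and induce.
    # Assumes: pip install nltk; nltk.download('brown')
    from collections import Counter
    from nltk.corpus import brown

    # Rule-based route: state the regularity up front (a toy rule of mine).
    def article_rule(tags):
        """Claim: an article (Brown tag AT) is followed by a noun or adjective."""
        return all(t2.startswith(('NN', 'JJ'))
                   for t1, t2 in zip(tags, tags[1:]) if t1 == 'AT')

    # Statistical route: induce what follows an article from bigram counts.
    bigrams = Counter()
    for sent in brown.tagged_sents(categories='news')[:500]:
        tags = [tag for _, tag in sent]
        bigrams.update(zip(tags, tags[1:]))

    after_article = Counter({t2: n for (t1, t2), n in bigrams.items() if t1 == 'AT'})
    print(after_article.most_common(5))  # what actually follows an article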
I would not venture into the number and quality of the models, other than to say: well, they are there; find them, time permitting.
Another thing worth mentioning is analogical reasoning and the work of Douglas Hofstadter (http://en.wikipedia.org/wiki/Douglas_Hofstadter). I don't particularly believe in the utility of cognitive science, but yes, it is worth a dekko.
You may also look at http://formalsystemsphilosophy.blogspot.com/
Some author-specific messages
Wolf K

You are right when you say that "What makes natural languages impossible to formalise is metaphor". But need we formalize it in the sense of bottling it in a formal logic system, which was originally designed for a different purpose, viz. the axiomatization of mathematics? As for "For any given set of words, a variety of collocations are syntactically..": you are absolutely right within what I am referring to as the language philosophy model = bag of words. Change the axioms (I am using the word very loosely, for belief in a set of views) and you will arrive at a different conclusion, and perhaps a new set of problems. As the above examples show, there are alternatives to a statistical description.


Neil W Rickert

"And semantics does not easily map into rules of inference." Reason is that we are doing something we perhaps should not be doing . Semantics in a formal logic system would be defined in terms of the interpretation of the terms and compositionality where as I would rather define sentence as meaningful if it makes sense to a human. "Green Ideas sleep furiously" might not make sense to Noam Chmosky. But it does to me I interpret it as meaning Newly developed ideas lie there sleeping for a long while before being put to use. Poetry,Rhetoric and Metaphor cannot be modelled by formal logic systems.
"ontology is crap". I can understand your frustration. But categorisation thinking in terms of hierarchy are useful tools. I noticed that SUMO puts the same entity at two places in the tree. My Solution was to add a tree basis i.e to use a hierarchy of entities specified not just by entity but by another attribute = tree basis. e.g. powerset(the company) will appear in the tree of Microsoft Subsidaries (=basis) (and Sub Subsidaries). It can also appear in the tree of Industries (basis=Industry). You could add the time element also.
Ian Parker

"mathematics may be defined as the study of formal systems" This is one view and certainly not the current one. see for example http://en.wikipedia.org/wiki/Penelope_MaddyTranslation in general is part of NLU rather than NLP though you could approximate translations using statistical or other techniques.
The word "formal systems" is used by a majority to mean formal logic systems. I prefer to use it more broadly.
I am not too clear on how "it is possible to have a maximum entropy approach whereby grammar and meaning are both present in a 'Hamiltonian'". Anyway, you seem to have missed the main point, viz. that we have to look beyond grammar and predicate logic. Maybe I have not communicated effectively.
Lotzi Boloni

"A computer language is a formal system" is not a definition of a programming language. It simply means that the language can be modelled using formal logic, which ensures that the programmer is assured of a correct output under all logical circumstances, by simulation of the language on a computer. Abstraction removes such things as font, format, etc. You are right when you say that it does not fully model the real world.
Brian Martin

I think I agree with you that semantic modelling is more important, with syntax used perhaps only as an adjunct. Perhaps we could do direct semantic parsing. Semantic role labelling uses the parse as one feature, but even that is not essential if an alternate set of features could be put in place.

Sunday, February 8, 2009

Formal Systems, Formal Logic Systems and Natural Language Processing

Recently I read a post on Usenet (http://groups.google.com/group/comp.ai.nat-lang/t/5855301973b928da?hl=en) about how to formalize a natural language, and I felt there are two issues:
1. Representing an arbitrary set of natural language statements (of finite length, say 10 pages) in a formal system.
2. Solving them on a computer.
A formal system is not equal to a formal logic system
While formal systems may use formal logic, they need not do so. To qualify for the attribute "formal", a system has to meet certain requirements; see for example http://formalsystemsphilosophy.blogspot.com/ to get a feel for what I mean.
Well what does it mean to formalize in the first place?
To formalize is to agree on basic terminology, methods of reasoning, and interrelations of terms in the vocabulary/lexicon, and generally to have a good model for the stated purpose. Please note that there may be different models of the same domain, serving different purposes or the same purpose with different efficiencies. Models for NLP range from N-gram models to PCFG to HPSG. And while HPSG/LFG may perhaps not be counted as models, any form of grammar is in essence a model of the language.
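To ground the range, here is a minimal sketch of the simplest of those models, a bigram model with maximum-likelihood estimates; the twelve-word "corpus" is a placeholder of my own, far too small for anything real.

    # A minimal bigram language model with maximum-likelihood estimates.
    from collections import Counter

    corpus = "the cat sat on the mat . the dog sat on the rug .".split()

    bigram = Counter(zip(corpus, corpus[1:]))
    unigram = Counter(corpus)

    def p(word, prev):
        """P(word | prev) estimated by maximum likelihood."""
        return bigram[(prev, word)] / unigram[prev]

    print(p('cat', 'the'))  # 0.25: one "the cat" among four occurrences of "the"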

To me a model of the language should:

  • Account for observed natural language phenomena.
  • Be kind to the context: I mean, it must be context-sensitive.
  • Account for words, their interrelationships and their correspondence to the physical world.
  • Allow rephrasing in a mechanical manner by a human using the model.
  • Allow linguistic entailment to be worked out using the model.
  • Support inferencing: the model, with accepted forms of inferencing, should solve most if not all problems.

Note that I am saying humans should be able to work with the model; computers can follow later, if possible.

Is there a good language model?

The answer, sadly, is a definite no in the sense above. All the models used by the AI/NLP community draw their inspiration from:

  • Statistical processing: n-gram, MaxEnt, CRF
  • Statistics with grammar: PCFG
  • Pure grammar models: HPSG/LFG

Limitations of all Statistical Methods

Machine learning algorithms do not process arbitrary language inputs; they can at best cater to a small subset of the language. Any machine learning technique has to use statistical processing of one sort or another, and to put it crudely, "statistical processing is tossing a coin to decide if the human in front of you is male or female". Before you do number crunching using MaxEnt or CRF or SNLP, you need to know what numbers you are crunching. At one point in time statistical parsing was supposed to be the ultimate; today people talk in terms of using it in conjunction with HPSG etc. These remarks apply to all statistical techniques, irrespective of the domain. Einstein is supposed to have said "God doesn't play dice" in the context of quantum mechanics.

The only justification for a statistical technique is that a normal law is subsumed by the technique with a probability of 1. But in general we are trying to induce the law using some sort of statistical processing, where we define the terms of reference, e.g. the grammar or the attributes and so on. Goof that up and you goof up the whole thing. Domain knowledge is more important than statistics or number crunching in figuring out the supposed law. "Garbage in, garbage out", as I learnt ages back from a computer text.

In any form of statistical processing we are doing two things. First, we are using inductive logic. Second, we are specifying the parameters in terms of which to find the law, i.e. the statistical model parameters. The parameters represent our belief in a particular formulation of the law. Supposing we did not know the ideal gas law, we might gather data on pressure, volume and the colour of the gas, and get a good correlation, but as we all know, we might miss out on the actual law. In the context of NLP we need to distinguish between a language model and a statistical language model. The former underlies the latter, and no amount of refinement of the latter can compensate for deficiencies in the former. To put it more concretely: a change in grammar will mean a different treebank, or its equivalent. You may improve on the maximum likelihood of the data, but the data you collect is based on what you believe are the parameters governing the process. The sketch below makes the gas-law point concrete.
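Here is the gas-law point as a runnable sketch, using synthetic data and ordinary least squares via NumPy; the feature choices and numbers are mine, purely for illustration.

    # Wrong parameters vs. right parameters: fitting synthetic PV = nRT data.
    import numpy as np

    rng = np.random.default_rng(0)
    n, R = 1.0, 8.314
    V = rng.uniform(1.0, 10.0, 200)       # volume
    T = rng.uniform(200.0, 400.0, 200)    # temperature
    colour = rng.uniform(0.0, 1.0, 200)   # an irrelevant attribute
    P = n * R * T / V                     # the actual law

    def r_squared(X, y):
        """R^2 of a least-squares fit of y on X (with intercept)."""
        X = np.column_stack([X, np.ones(len(y))])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        return 1 - (y - X @ coef).var() / y.var()

    print(r_squared(np.column_stack([V, colour]), P))  # mediocre fit: wrong terms
    print(r_squared((T / V).reshape(-1, 1), P))        # ~1.0: the right term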

Pure grammar solutions like HPSG/LFG rely on grammar, which is another term for word order, plus FOL semantics.

Is there really a grammar that can describe most if not all of the language? Unfortunately we work on the assumption that there is. When you look at the number of rules in the raw treebank (around 15,000), something seems lacking; even removing redundant or subsumed rules still leaves you with a big bunch of rules, and that is the tip of the proverbial iceberg.
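You can get a feel for the rule explosion with NLTK's bundled 10% sample of the Penn Treebank (the full treebank yields far more; exact counts depend on the sample):

    # Counting distinct CFG productions in NLTK's Penn Treebank sample.
    # Assumes: pip install nltk; nltk.download('treebank')
    from nltk.corpus import treebank

    productions = set()
    for tree in treebank.parsed_sents():
        # keep grammar rules only, dropping lexical (POS -> word) productions
        productions.update(p for p in tree.productions() if p.is_nonlexical())

    print(len(productions))  # thousands of distinct rules from a 10% sample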

FOL is a subset of all reasoning, and yet it is made the basis of all the semantics on offer, take it or leave it. So the sum total of these methods (HPSG/LFG and others of their ilk included) can at best represent or model a tiny subset of language.

Whither new Model?

No answers from me, but one thing: I am an optimist, and I hope some clever philosopher/linguist/NLP whiz will solve the general language understanding problem.

Is Mathematics a Formal Logic System, and Can Mathematics Represent Natural Language?

Basically, formal logic was invented to make proofs foolproof: one can conjecture a theorem in maths and prove it using a valid reasoning procedure (deductive logic) to the satisfaction of peers.

I claim that mathematics is not equivalent to a formal logic system. To justify this, I need only point out that there are statements in arithmetic which are true but cannot be proved from a given set of axioms.
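That is Gödel's first incompleteness theorem, stated with the usual caveats:

    % Goedel's first incompleteness theorem, informally rendered.
    % T: any consistent, effectively axiomatized theory containing arithmetic.
    \text{There is a sentence } G \text{ of arithmetic such that }
    T \nvdash G \ \text{ and } \ T \nvdash \lnot G,
    \ \text{although } \mathbb{N} \models G.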

I believe there might be alternative paradigms to the present FOL-plus-grammar approach.


Friday, January 11, 2008

Artificial Intelligence

Foreword

There is no dearth of natural language processing and artificial intelligence blogs.



Some promote their companies. Others, mostly students, put their new-found wisdom on record. Yet others offer their views or report conference proceedings.



My boss once advised me, 'Sail with the wind.' I said, 'Happily, provided it takes me where I want to go.' So if you know where you want to go, you ask, 'OK, how do I get there?'


In software, or for that matter in any engineering effort, one wants to avoid 'reinventing the wheel' as far as possible. But if the existing wheels don't do what you want, you have little choice: either you say 'can't do it', or you say 'let's take the bull by the horns'. Personally, I would do the latter.


In these posts we are setting down our experiences with various pieces of freely available software.


Part of what the software will do is described in the publications; the intent, mostly. Not much is available on 'how well it does in practice'. It is the latter that is the focus of the various posts.


We hope the blog will be found useful by others who are recent entrants to the field.


stanford-parser-2005-07-21

This is a statistical parser, freely available for playing around with and for generally trying to figure out whether there is anything useful you could do with it.


We downloaded it a couple of years back, and at that point it said it preferred a more recent version of Java than we had. We downloaded the latest version of Java and checked the parser out.


On the positive side, the lexparser worked fine and we could see a nice parse-tree diagram of the test sentence and of the other sentences we tried. Then we tried using the command line, with an input file containing a few sentences, redirecting the Treebank-style parse trees to another file.



It permitted just one sentence in the input file; try adding more, and it complained of a lack of heap space.


To be fair, I must reproduce what the documentation says:

" To run the PCFG parser on sentences of up to 40 words you need 100 Mb of memory. To be able to handle longer sentences, you need more (to parse sentences up to 100 words, you need 400 Mb). "


We recently tried it out once again on one of our machines. It managed a lowly 2.08 seconds per word, i.e. 121 words in about 4.2 minutes.



Now let's take a look at some claims on the state of the art:
"In particular, in our Java implementation on a 3GHz processor, it is possible to parse 1600 sentences in less than 900 sec. with an F1 of 91.2%. This compares favorably to the previously best generative lexicalized parser for English (Charniak & Johnson (2005): 90.7% in 1300 sec.)."

from "Learning and Inference for Hierarchically Split PCFGs" by Slav Petrov and Dan Klein.

I can't vouch for it, but it's wow!