[This is from Newnes’ Pictorial Knowledge. I grew up with this children’s encyclopaedia, published in 1930, second edition 1934. Volume 8 has a QA section with the above priceless title, extracts of which will be added as and when I feel moved to. The text begins as follows]

There are so many wonderful things in this world of ours that most of us, whether we be young or not so young, can think of countless questions we would like to ask about them if only we could find someone who would answer them for us. In the pages that follow are hundreds of such questions and each one is answered clearly and simply, in a form that each one of us can understand.



Why does a Balloon rise?

Because it is lighter than the body of air which it displaces. It is forced upwards by the difference between the upward and downward pressure of the air on it.

When did a man first ascend in a Balloon?

On November 21st, 1783, when two Frenchmen rose in a huge fire-balloon from Paris. The balloon attained a height of 3,000 feet, and came to earth about two miles from the starting point. The first ascent in a gas balloon was made at Paris on December 1st, 1783, by Professor Charles, who used hydrogen gas. He rose after sunset to a height of 3,000 feet, and at that elevation was the first man to see the sun set a second time on the same day.


What is meant by a Fascist?

A member of the Italian society pledged to oppose Communism, and founded in 1919 by Signor Benito Mussolini. The word Fascist is connected with the fasces, or bundles of sticks carried in olden times by Roman lictors as emblems of their authority, and adopted by the Fascisti (pl. of Fascist) themselves. The Fascisti wear a black shirt as their uniform, and are spoken of as Blackshirts. The movement started by Mussolini spread all over Italy, quelled the revolutionaries who were trying to upset everything, and in the end made Mussolini practically the dictator of Italy. Societies on the same lines as that of the Fascisti have been formed in some other countries. [I have an idea which countries they mean. CB]

When was the Air Mail Service to India begun?

In March, 1929.

[more to come]


Posted by: Chris Brew | August 16, 2013

Measuring SVG text in the IPython notebook.

I spent a little time learning how to use the IPython notebook to get reliable information about the height and width of SVG text. This is tricky enough that it is worth putting out, because I’m sure it is a common need. Here are before and after images of the text.

See http://nbviewer.ipython.org/6248510

The point of this is to enable good display of linguistic objects such as trees and dependency graphs, similar to what Sebastian Riedel has done with WhatsWrongWithMyNLP. Watch this space. Currently, the results look good to me in the IPython notebook under Safari. Comments and advice more than welcome.

Posted by: Chris Brew | June 27, 2012

One for negation and idiom researchers

Northern Ireland expert Denis Murray, on BBC, talking with an interviewer about the Queen’s historic meeting with Martin McGuinness.

Interviewer: Would you say that until recently something like this would have been unthinkable?

Murray: I’d say more than that, until two years ago it was not even thinkable.

It’s pretty clear that Murray felt he was adding information. But why does “not even thinkable” mean more than “unthinkable”?

I believe he took the interviewer’s “unthinkable” to mean “you can (indeed must) think it, but you really should not do it”, and augmented it by using “not even thinkable” to imply that “nobody would even have thought of it, or have needed to judge that it was a bad idea”.

Posted by: Chris Brew | May 10, 2012

Kaggle and automated essay scoring

I went to DC yesterday to meet the winners of the first stage of the Hewlett Foundation’s Automated Student Assessment Prize. It was a very interesting experience. For this stage, the task was to grade a range of essays that had been selected by the organizers, and for which human scores were available.

The first thing that struck me was that the winning teams were primarily data mining experts with an ability to pick up NLP and educational assessment research as needed. Some of them said they had read our papers, others that they had made everything up on their own.

I’m enjoying reading Kaggle competitors’ experience reports and fitting the ideas into my thinking. One obvious practice is that teams often consist of several people, each of whom has a complete running system. The pattern is that competitors run separately for a while, then coalesce into teams who ensemble together their systems. If the systems are complementary, the collective does better than the individual components.


Posted by: Chris Brew | April 12, 2012

Pandas is useful, when enchanted

If you are a Python devotee, and you write programs that munge words, the combination of http://pandas.pydata.org/ and http://packages.python.org/pyenchant/ is easy and efficient for tokenization, word counting, and all that. Recommended.

Posted by: Chris Brew | June 30, 2011

New Job

I just moved to the Educational Testing Service, to work on their cRater project, which is like the famous eRater essay-grading project, except with more semantics, and for short answers instead of essays. I THINK the c stands for content, but it could stand for “constructed response”, which is a psychometrics and educational testing thing. Learning fast, having fun, working with a great group of people. Note that the cRater link describes cRater as it was in 2004, not as it is now.

Posted by: Chris Brew | April 11, 2011

“You can observe a lot by just watching” Yogi Berra

While it isn’t always easy, I can usually tell where people were raised as well as where they were born. Koreans raised in Los Angeles have a style completely different from those born in Seoul; the English, en masse, look different from the Scots, and it just takes one look at those wacky triangular eyeglasses for me to know that a young lady is either French or getting that way. In the same way, just by looking, I can pretty much diagnose the families waiting for the Newark to Gatwick flight. Here’s one: prematurely greying father with John Lennon glasses, slightly older mother with shoulder length hair, three blond boys with backpacks and crew cuts. I’m like, O.K., he’s British, she’s American, all three boys born in the USA. Or, Asian looking father, fiftyish, no mother in the party, two young teen daughters, one classically Eurasian looking, the other blonder. Sure, I can do that: he’s born in Hong Kong, but doesn’t speak Cantonese well, one kid born in Shanghai, the other in canoe transit up the Amazon. Same father, I think, but the first girl’s mother is definitely working as a dogcatcher in Evansville, Indiana, and the second one’s mother once had that unfortunate accident with a hairnet and an avocado. Could these be the same person? Very likely, but I’m not infallible: while I know for sure that the father is part-time seal tamer and computer science professor, I can’t be sure whether he’s a bigamist. It’s just a matter of assessing the evidence.

I can also tell what language people speak, because the patterns of vowels and consonants shape the face. Turkish oral surgeons spend 47% of their time unsticking the tongue tip from the roof of the mouth. “Who put the gluten in this agglutinative language?”, they cry. And did you know that Mick Jagger was raised Basque? His English accent is a fake: he stole it from a classmate at LSE, using 1960s recording technology and a hypnopaedic pillow. You don’t get those lips from an Indo-European language, let me tell ya! Angela Lansbury is Swedish, and Dick van Dyke really is a cockney. As a young man Rex Harrison sang Wagner’s Parsifal with Maria Callas in the Italian premiere at La Scala: the My Fair Lady thing is a front. Not many know that, but you can see all this in their faces.

Just by looking, I can tell whether your dog will develop cataracts (and whether your cat will develop doggeracts, should you care). Show me your friend’s wardrobe, and I can predict the mean rainfall over the Andes for the next two weeks. Two glances inside your purse and I can diagnose your psychological problems to eight decimal places AND predict your fashion preferences. Just from your diet, I can tell you not only your height, weight and hat size but your views on a wide range of social issues and the three last digits of your social security number. If you were raised by wolves, I can tell. If you were kept locked in a cupboard by your neglectful parents, I will spot it, and be able to offer career advice, speech therapy and a range of inexpensive after-care options. If your father married his half sister and you were raised by a vengeful dwarf in the forest, I will know, and be the first to offer you a place to lay your sword. And advise you on whether the local fire brigade is any use for your unknowingly genetically suspect purpose. But I’m not special, I think most people could do that, just by looking.
Posted by: Chris Brew | March 26, 2011

How to do comparisons between machine learning schemes

Nice paper comparing 16 model selection and weighting schemes. Includes 58 benchmark datasets. The data analysis was done in the following way:

– for each dataset, rank the schemes. Then average the ranks.
– use the Friedman test to test whether ranks are all equal
– if ranks are not all equal, use the Nemenyi test (covered in papers by Demsar, Garcia et al http://jmlr.csail.mit.edu/papers/volume9/garcia08a/)

Ying Yang, Geoffrey I. Webb, Jesús Cerquides, Kevin B. Korb, Janice R. Boughton, Kai Ming Ting: To Select or To Weigh: A Comparative Study of Linear Combination Schemes for SuperParent-One-Dependence Estimators. IEEE Trans. Knowl. Data Eng. 19(12): 1652-1665 (2007), ISSN: 1041-4347 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=
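
Here is a minimal sketch of that recipe in plain Python, in case it helps to see the arithmetic. The function names and the toy scores are my own assumptions for illustration, not anything taken from the paper. The Friedman statistic is the usual chi-square form computed from the average ranks, and the Nemenyi critical difference takes the critical value `q_alpha` as a parameter that you have to look up in a table (for example the one in Demsar's paper) for your number of schemes.

```python
import math
from typing import List


def average_ranks(scores: List[List[float]]) -> List[float]:
    """scores[i][j] is the score of scheme j on dataset i (higher is better).

    Within each dataset the best scheme gets rank 1; tied schemes share
    the mean of the ranks they span. Returns the mean rank per scheme.
    """
    n_schemes = len(scores[0])
    totals = [0.0] * n_schemes
    for row in scores:
        # Scheme indices ordered from best score to worst.
        order = sorted(range(n_schemes), key=lambda j: -row[j])
        pos = 0
        while pos < n_schemes:
            # Extend `end` over the run of schemes tied with order[pos].
            end = pos
            while end + 1 < n_schemes and row[order[end + 1]] == row[order[pos]]:
                end += 1
            mean_rank = (pos + end) / 2 + 1
            for p in range(pos, end + 1):
                totals[order[p]] += mean_rank
            pos = end + 1
    return [t / len(scores) for t in totals]


def friedman_statistic(avg_ranks: List[float], n_datasets: int) -> float:
    """Friedman chi-square statistic (k - 1 degrees of freedom)."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


def nemenyi_critical_difference(q_alpha: float, n_schemes: int, n_datasets: int) -> float:
    """Two schemes differ if their average ranks differ by at least this much.

    q_alpha must be looked up for the chosen significance level and n_schemes.
    """
    return q_alpha * math.sqrt(n_schemes * (n_schemes + 1) / (6 * n_datasets))


# Toy data (made up): 4 datasets, 3 schemes.
toy_scores = [
    [0.90, 0.80, 0.70],
    [0.85, 0.80, 0.60],
    [0.70, 0.75, 0.65],
    [0.95, 0.90, 0.80],
]
toy_ranks = average_ranks(toy_scores)
toy_chi2 = friedman_statistic(toy_ranks, len(toy_scores))
```

The statistic is then compared against a chi-square distribution with k - 1 degrees of freedom; only if that rejects "all ranks equal" does it make sense to go on to the pairwise Nemenyi comparison.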

Posted by: Chris Brew | April 27, 2010

Why statisticians shouldn’t write movie titles

Never Give a Sucker an Asymptotically Even Flip
The Variational Enigma
The Fisher King
Independence Day
Return to Monte Casino
The Man Who Measured the Bank at Monte Carlo
Between 99 and 103 Dalmatians
8.5 +/- 0.2
The Metropolis Method (void where prohibited by law)
Improper Priors go wild on Cancun

Posted by: Chris Brew | April 6, 2010

Plain speaker’s guide to “any more” and “anymore”

First, here’s a rephrasing of what Huddleston and Pullum’s epic Cambridge Grammar of the English Language says about “any more” and similar adverbs. The main discussion is on p 710 and following, with other bits on 823 and 831.

  1. They are polarity sensitive: this means that there is a difference in acceptability between “She isn’t here any more” and “She is here any more”. For many speakers, the first is OK, the second not.
  2. The difference between “any more” and “anymore” is a British/American spelling difference.
  3. You can line up “anymore” with “still” and “no longer”. They differ in how they work with negation.

My own impressions follow. Most speakers can say:

“She is still here” (i.e. she is here and has been for a while)
“She is still not here” (i.e. we are waiting, and she still hasn’t arrived),
“She is not here anymore”, “She is no longer here” (in both cases, she was here, but now isn’t)

Many speakers find “She is not still here” and “She is here anymore” awkward. For the first one the intended meaning is the same as the one expressed by “She is no longer here”. Some speakers, including me, blow a fuse when confronted with the second one, and don’t even understand what it means. For others, “anymore” can be used anywhere that “nowadays” is, with much the same meaning, so “She is here anymore” could be used (if you are, say, in a bar) when the person in question used to avoid the bar but now hangs out there on a regular basis. Similarly “Ice cream is cheap anymore” works for many people, but in my natural dialects, I would have to either turn it round and say “Ice cream isn’t expensive anymore” or punt and say “Ice cream is cheap nowadays”.

Unfortunately, linguists have taken to confusing themselves and others by talking about “positive anymore”. If they had called it “nowadays anymore” there would have been no trouble. These adverbs are neither positive nor negative, just a little fussy about what kind of sentences they like to be wrapped up in. The “nowadays” translation helped me, and is from John Lawler. As he says:

Apparently, for users of positive “anymore”, “nowadays” doesn’t
cut it anymore. Anymore, they use “anymore” instead. Or perhaps
only in certain speech contexts; the definitive sociolinguistic
study remains to be done.

I guess I can forgive him for using the term “positive”, because he puts it in quotes and gives an amusing example.

By the way, in Columbus, Ohio, ice cream really is cheap and good at Graeter’s and Jeni’s. No ice creams were consumed in the creation of this post, but several area shops are on high alert.
