Posted by: Chris Brew | January 31, 2017

In the dark

In darkness beats my heart exceeding slow
The clock, unsleeping, taps its metric round
Unbidden come my breaths, and silent go
An hour till dawn, no dreams, no light, no sound
Yet still my mind must race, and run, and churn
Upon the tasks that day alone can bring
And so it never will to sleep return
Till letters come, dogs bark, and doorbells ring.
I start, was that the bugle’s martial call?
Will I awake to pipes? To battle cry?
No need, it was the thermostat, that’s all
Those pipes they creak a bit, and so do I
I must arise, and make some buttered toast
This is, with coffee, what I need the most.

An actual prizewinner: it earned £20 from “The Oldie”.

Posted by: Chris Brew | January 31, 2017

Under the spotlight



Lights up, camera on, what’s my bit?
A poem? a song? Maybe a painting?
A two-handed drama as gritty as grit?
I can’t think of anything, I am found wanting.
I dropped my talent in the washing up
The Fairy Liquid dissolved it in the bowl
There used to be words, now there’s just stuff
Nothing to see when I open my soul
I wrote a sonnet, but it was eaten by the dog
I spilled the quatrains in my fruit salad
I caught the couplets sneaking out for a snog
The final line, frankly, was weak, shallow, and pallid.
In this house you’re remorselessly held to the standard
Do something brilliant in front of your grandad

Written in ten minutes for a family talent show. If you can’t make the last couplet work, remember that I speak British, so “standard” does rhyme with “grandad”. And look up “snog” if you need to. Other rhymes, well, ten minutes!


Posted by: Chris Brew | January 31, 2017

Soap story (another sonnet)


“Young man, what have you brought for me to play?”
I do try to please her, for when she’s glad,
She has the eyes of the young Lady Day.
They could well have worked: I thought that they had:
The nymphs and swains of my pastoral lay.
But her old dark eyes are deep cleansing pools
And her Faerie liquid washes my whimsy away.
Be it so gentle, she won’t allow fools.
Or my inadequate half-formed notions
I know it of old, her reproving frown
The surface tensions come with the potions
My Muse, with cigar, and yellow satin gown
And the dark sweet voice of strong black coffee
Is like “Honey pie, you can’t soft soap me!”

(this one was deemed incomprehensible by some readers, but I like it)

Posted by: Chris Brew | January 31, 2017

March in January

We do not choose the road that we must tread;
Select the gentlest slopes, the smoothest bends.
There is no steady, rising, upward thread;
This life is not corralled, nor known its ends.
Athwart our forward path the shadows stray
From places that we do not want to go,
Nightmares that on our rattled spirit prey
Whose dread truth we hope we need never know.
But still in ev’ry time there is a choice –
And, that day, many did elect to care.
Nor was it “all just words”, the people’s voice;
We made the promise that “We will be there.”
In daunting times that stray beyond the normal
Words and acts must move beyond the formal.

Posted by: Chris Brew | November 11, 2016

Running SOLR on a Mac

I have a need to explore a collection of text files and associated metadata, in order to find out what is going to be possible for automatic analyses of these files.

In service of this need, I have spent some time coming to grips with the Solr search engine, which is built on top of Lucene, and looks as if it might do the job. Solr is a bit daunting, with excellent but copious documentation that I find somewhat overwhelming.

Here is a small success. I work on a Mac that runs El Capitan, and I use homebrew and Anaconda as my package managers.
Because of my last name, it is required by Federal law that I use homebrew, but I would anyway, because it is so helpful. Anaconda is also very convenient, especially for Python programmers.

The standard homebrew way of installing solr goes like this:

brew install solr

This prints a load of information, finishing with:

==> Caveats
To have launchd start solr now and restart at login:
  brew services start solr

So I did that. Then I wondered how my new Solr service was configured.
It’s hard to find out, so my next step was:

  brew services stop solr

Now I can (almost) follow the instructions at Getting Started with SolrCloud.  The “almost” is because homebrew has injected solr into my path, so I can actually type

solr -e cloud

rather than

bin/solr -e cloud

The guide assumes that you have downloaded a solr distribution and are working from within its top-level directory. Nothing wrong with that, but it is a slight change.

I accept all the defaults, and finish up with a running solr instance with an admin page at http://localhost:8983/solr.  Cloud mode sets up a collection called gettingstarted that is configured to try to infer the schema of documents that it sees.
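Once the instance is up, a quick check from Python is possible. The following is a minimal sketch, assuming the default address above and Solr's standard /select query handler; the helper names select_url and count_documents are made up for illustration, and count_documents only works while the server is running.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Address of the admin page mentioned in the post (default port).
SOLR_BASE = "http://localhost:8983/solr"


def select_url(collection, query="*:*", rows=10):
    """Build the URL for a simple query against a collection's /select handler."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{SOLR_BASE}/{collection}/select?{params}"


def count_documents(collection):
    """Ask a running Solr how many documents match *:* (needs a live server)."""
    with urlopen(select_url(collection, rows=0)) as response:
        return json.load(response)["response"]["numFound"]


if __name__ == "__main__":
    # Only builds and prints the URL; nothing is sent over the network here.
    print(select_url("gettingstarted"))
```

Pasting the printed URL into a browser should show the same JSON that count_documents reads.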

Indexing some documents

OK, so next I tried indexing some documents, and had no joy from

post -c gettingstarted example/films/films.json

This was a suggested command, but it complains and fails to index any documents. Fortunately, something similar does work:

solr stop -all
brew services start solr
cat example/films/README.txt
solr create -c films
curl http://localhost:8983/solr/films/schema \
-X POST -H 'Content-type:application/json' --data-binary '{
    "add-field" : {
    "add-field" : {
post -c films example/films/films.json

The curl command is necessary in order to override the guesses that Solr makes about the fields that it sees in the data.
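The same kind of schema request can be prepared from Python. This is a sketch under stated assumptions, not the exact command from the README: the field names and types in FIELDS are hypothetical placeholders, and add_field_request is an illustrative helper. It relies only on the Schema API's "add-field" command, which accepts a list of field definitions.

```python
import json
from urllib.request import Request


def add_field_request(core, fields, base="http://localhost:8983/solr"):
    """Prepare (but do not send) a Schema API request that adds explicit fields."""
    body = json.dumps({"add-field": fields}).encode("utf-8")
    return Request(
        f"{base}/{core}/schema",
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )


# Hypothetical field definitions, for illustration only; the real ones
# should come from example/films/README.txt.
FIELDS = [
    {"name": "title", "type": "text_general", "stored": True},
    {"name": "release_date", "type": "string", "stored": True},
]

request = add_field_request("films", FIELDS)
```

Passing request to urllib.request.urlopen would send it to a running Solr. Declaring fields before indexing means Solr does not have to guess their types from the first documents it sees.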

This sequence works, and you can go into the admin console to see 1100 documents
in the films core. Solr has two closely-related concepts:

  1. a core: films is a core
  2. a collection: gettingstarted was a collection

I don’t yet understand the difference between the two concepts. Sharp eyes will notice that the admin console now has a list of cores, whereas when running in cloud mode it had collections. Maybe this matters one day. For now, let’s move on.



[This is from Newnes’ Pictorial Knowledge. I grew up with this children’s encyclopaedia, published in 1930, second edition 1934. Volume 8 has a Q&A section with the above priceless title, extracts of which will be added as and when I feel moved to. The text begins as follows.]

There are so many wonderful things in this world of ours that most of us, whether we be young or not so young, can think of countless questions we would like to ask about them if only we could find someone who would answer them for us. In the pages that follow are hundreds of such questions and each one is answered clearly and simply, in a form that each one of us can understand.



Why does a Balloon rise?

Because it is lighter than the body of air which it displaces. It is forced upwards by the difference between the upward and downward pressure of the air on it.

When did a man first ascend in a Balloon?

On November 21st, 1783, when two Frenchmen rose in a huge fire-balloon from Paris. The balloon attained a height of 3,000 feet, and came to earth about two miles from the starting point. The first ascent in a gas balloon was made at Paris on December 1st, 1783, by Professor Charles, who used hydrogen gas. He rose after sunset to a height of 3,000 feet, and at that elevation was the first man to see the sun set a second time on the same day.


What is meant by a Fascist?

A member of the Italian society pledged to oppose Communism, and founded in 1919 by Signor Benito Mussolini. The word Fascist is connected with the fasces, or bundles of sticks carried in olden times by Roman lictors as emblems of their authority, and adopted by the Fascisti (pl. of Fascist) themselves. The Fascisti wear a black shirt as their uniform, and are spoken of as Blackshirts. The movement started by Mussolini spread all over Italy, quelled the revolutionaries who were trying to upset everything, and in the end made Mussolini practically the dictator of Italy. Societies on the same lines as that of the Fascisti have been formed in some other countries. [I have an idea which countries they mean. CB]

When was the Air Mail Service to India begun?

In March, 1929.

[more to come]


Posted by: Chris Brew | August 16, 2013

Measuring SVG text in the IPython notebook.

I spent a little time learning how to use the IPython notebook to get reliable information about the height and width of SVG text. This is tricky enough that it is worth putting out, because I’m sure it is a common need. Here are before and after images of the text.


The point of this is to enable good display of linguistic objects such as trees and dependency graphs, similar to what Sebastian Riedel has done with WhatsWrongWithMyNLP. Watch this space. Currently, the results look good to me in the IPython notebook under Safari. Comments and advice more than welcome.
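One way to get at the rendered size of SVG text is to let the browser measure it. The following is a minimal sketch and not necessarily the method used for the images in this post: it builds an HTML fragment in which a script calls the standard SVG DOM method getBBox() on a text element. The function name measuring_snippet, the element id "probe", and the fixed canvas size are all assumptions made for illustration.

```python
from html import escape


def measuring_snippet(text, font_size=16, element_id="probe"):
    """Return HTML that draws `text` in SVG and reports its bounding box."""
    return f"""<svg xmlns="http://www.w3.org/2000/svg" width="400" height="60">
  <text id="{element_id}" x="5" y="30" font-size="{font_size}">{escape(text)}</text>
</svg>
<pre id="{element_id}-size"></pre>
<script>
  // getBBox() returns the bounds of the element as the browser rendered it.
  var box = document.getElementById("{element_id}").getBBox();
  document.getElementById("{element_id}-size").textContent =
      "width=" + box.width + " height=" + box.height;
</script>"""


snippet = measuring_snippet("colourless green ideas")
```

In a notebook the fragment could be shown with IPython.display.HTML(snippet). Whether the embedded script runs depends on the notebook front end and browser, so the reported numbers should be checked by eye.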

Posted by: Chris Brew | June 27, 2012

One for negation and idiom researchers

Northern Ireland expert Denis Murray, on BBC, talking with an interviewer about the Queen’s historic meeting with Martin McGuinness.

Interviewer: Would you say that until recently something like this would have been unthinkable?

Murray: I’d say more than that, until two years ago it was not even thinkable.

It’s pretty clear that Murray felt he was adding information. But why does “not even thinkable” mean more than “unthinkable”?

I believe he took the interviewer’s “unthinkable” to mean “you can (indeed must) think it, but you really should not do it”, and augmented it by using “thinkable” to imply that “nobody would even have thought of it, or have needed to judge that it was a bad idea”.

Posted by: Chris Brew | May 10, 2012

Kaggle and automated essay scoring

I went to DC yesterday to meet the winners of the first stage of the Hewlett Foundation’s Automated Student Assessment Prize. It was a very interesting experience. For this stage, the task was to grade a range of essays that had been selected by the organizers, and for which human scores were available.

The first thing that struck me was that the winning teams were primarily data mining experts with an ability to pick up NLP and educational assessment research as needed. Some of them said they had read our papers, others that they had made everything up on their own.

I’m enjoying reading Kaggle competitors’ experience reports and fitting the ideas into my thinking. One obvious practice is that teams often consist of several people, each of whom has a complete running system. The pattern is that competitors run separately for a while, then coalesce into teams that combine their systems into an ensemble. If the systems are complementary, the collective does better than the individual components.
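To make the ensembling idea concrete, here is a minimal sketch using the simplest possible combination rule, a plain average of each system's score for each essay. The winning teams' actual methods are not described here and were surely more elaborate; the three systems and their scores below are made-up illustrations.

```python
from statistics import mean


def ensemble_scores(system_scores):
    """Average the per-essay scores of several systems and round to a grade."""
    # zip(*...) regroups the lists so that each tuple holds one essay's scores.
    essays = zip(*system_scores)
    return [round(mean(scores)) for scores in essays]


# Hypothetical scores from three independent systems for four essays.
system_a = [2, 3, 4, 1]
system_b = [3, 3, 4, 2]
system_c = [3, 4, 5, 1]

combined = ensemble_scores([system_a, system_b, system_c])
```

Averaging helps only when the systems make different mistakes, which is the sense in which they need to be complementary.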


Posted by: Chris Brew | April 12, 2012

Pandas is useful, when enchanted

If you are a Python devotee, and you write programs that munge words, the combination of pandas and PyEnchant is easy and efficient for tokenization, word counting, and all that. Recommended.
