
What corpora HAVE done for us




Sinclair's seminal work, the bible of corpus linguistics

In this post I would like to defend linguistic corpora and their relevance to the ELT field, which Hugh Dellar has called into question.



Years ago, before I became familiar with corpus tools (corpus as in linguistic corpus = "collection of samples of real-world texts stored on computer"; plural = corpora), my colleagues and I had a fierce debate about whether to use the preposition to or for after the noun hint. We wanted to produce posters for English learning centres we had set up for a number of high schools, and each poster was meant to provide "Hints for/to speaking / listening etc".

Emails were sent back and forth about which preposition should be used, and the argument inevitably turned to the British/American distinction until somebody used Google Fight to compare hints to and hints for. Google Fight provided us with pseudo-scientific evidence that hints for is slightly more common than hints to. This was back in the days when I was still blissfully unaware that a Google search yields different results for different people, and sometimes even for the same person on different computers!







That was before I discovered the British National Corpus hosted on the Brigham Young University website. Had I discovered it earlier, I would have searched for hints + preposition and found that hints on something is actually more common than either of the two options we were so vehemently debating.
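A query like "hints + preposition" boils down to counting which prepositions directly follow a given word. Here is a minimal sketch of that idea in Python. The sample text and the preposition list are invented for illustration (they are not drawn from the BNC), so the counts say nothing about real usage; the function name is likewise my own.

```python
import re
from collections import Counter

# Toy sample invented for illustration only; a real query would run
# against a corpus such as the BNC, not five made-up sentences.
SAMPLE = (
    "The leaflet offers hints on revising for exams. "
    "She gave us a few hints for beginners. "
    "There are useful hints on pronunciation in the appendix. "
    "He dropped hints to his boss about a pay rise. "
    "The book is full of hints on writing essays."
)

# A short, deliberately incomplete list of prepositions (assumption).
PREPOSITIONS = {"on", "for", "to", "about", "of", "at", "in"}


def following_prepositions(text, node):
    """Count the prepositions that immediately follow `node` in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    # Walk over adjacent word pairs and keep those that start with the node.
    for current, nxt in zip(tokens, tokens[1:]):
        if current == node and nxt in PREPOSITIONS:
            counts[nxt] += 1
    return counts


counts = following_prepositions(SAMPLE, "hints")
print(counts.most_common())
```

With a real corpus the same pair-counting logic applies; only the source of the text changes.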





In his recent article What have corpora ever done for us, Hugh Dellar raises doubts about the usefulness of corpora to the ELT field. While not completely dismissive of corpus research and its value, Hugh basically argues that its effect on the language teaching profession has been enslaving rather than liberating. I find Hugh's polemic surprising, considering that corpus linguistics is what gave impetus to the Lexical Approach, of which Hugh is a staunch advocate (I used the corpus here to look up a "juicy" adjective for "advocate"!).





Objective view of language


In the past 30 years corpus research has provided irrefutable evidence about how language works, not least that language is
highly patterned in that it largely consists of recurrent lexico-grammatical combinations.






Starting with the Collins COBUILD project, corpora have revolutionised lexicography and changed the face of the modern dictionary. These days most respectable dictionaries, an indispensable tool for learners and teachers alike, include examples drawn from a corpus, frequency information and often register information (whether a word is more suitable for formal or informal contexts).







Corpora have shed light on many aspects of language which were previously described on the basis of intuition. Instead of groping in the dark and relying on anecdotal evidence, we now have access to authentic language data. For example, in the past many grammar books presented "any" as a sort of transformation of "some" used in negatives and questions:







I have some time. – I don't have any time. – Do you have any time?





Corpus research has shown that any is more common in affirmative sentences (50% of all uses of "any"), and less frequent in questions (only 10%), than prescriptive grammarians would have you believe.





Frequency of lexical or grammatical items is useful for deciding which items should be included in a syllabus. That said, frequency should be balanced against another consideration: relevance to the learner.






Corpora in the classroom: a boon or bane?


However, Hugh's main argument against corpora concerns their applicability to classroom teaching and their relevance to teachers themselves. As a teacher I find corpora invaluable. Just the other day a student asked me about the difference between classic and classical. I came up with classical music and classic mistake off the top of my head but had to consult a corpus to find further examples:





classic example / case / symptoms / mistake / movie

classical music / composer / tradition





Such puzzles with confusable words can easily be solved by using the Compare function in the BNC or COCA.
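To give a rough idea of what such a comparison involves, the sketch below lists the words that follow each of two confusable words and shows which ones are unique to each. It is not how the BNC or COCA interface is implemented; the function name is my own, and the sample sentence is stitched together from the collocations listed above purely for illustration.

```python
from collections import Counter, defaultdict

# Illustrative sentence built from the collocations in this post;
# it is not corpus data.
SAMPLE = (
    "a classic example of a classic mistake in a classic movie "
    "about classical music and a classical composer"
)


def right_collocates(tokens, nodes):
    """Map each node word to a Counter of the words that directly follow it."""
    table = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        if current in nodes:
            table[current][nxt] += 1
    return table


table = right_collocates(SAMPLE.lower().split(), {"classic", "classical"})

# Collocates found after one word but never after the other.
only_classic = sorted(set(table["classic"]) - set(table["classical"]))
only_classical = sorted(set(table["classical"]) - set(table["classic"]))

print("classic only:  ", only_classic)
print("classical only:", only_classical)
```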







No doubt some people are walking dictionaries and can (off the cuff) rattle off examples of usage, but I would rather look them up in a corpus. Very often I give my students an answer about how a word is used and then consult a corpus or (corpus-based) dictionary to confirm my hunch. I am often right but sometimes I overlook certain patterns. And why not get learners to look up the answers themselves? Although data-driven learning (DDL) hasn't gained much popularity, there is some evidence that getting students to study language data (concordances) by themselves is beneficial to vocabulary learning.
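For readers who have never seen one, a concordance simply lines up every occurrence of a search word with a few words of context on either side (often called KWIC, "keyword in context"). Below is a minimal sketch; the function name, the window size and the three sample sentences are my own assumptions, and real concordancers offer far more (sorting, tagging, sentence boundaries).

```python
# Three made-up sentences for illustration; not corpus data.
SAMPLE = (
    "Do you have any time tomorrow? "
    "You can take any bus from here. "
    "I don't have any money left."
)


def kwic(text, node, width=3):
    """Return keyword-in-context lines: `width` words either side of `node`.

    The context window is a plain word count, so it may run across
    sentence boundaries.
    """
    tokens = text.split()
    lines = []
    for i, token in enumerate(tokens):
        # Compare case-insensitively and ignore trailing punctuation.
        if token.lower().strip(".,!?;:") == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Right-align the left context so the keywords line up.
            lines.append(f"{left:>25} [{token}] {right}")
    return lines


lines = kwic(SAMPLE, "any")
for line in lines:
    print(line)
```

Learners can scan the aligned keyword column and work out for themselves what tends to come before and after the word.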





Finally, Hugh argues that corpora make English as a foreign language unnecessarily foreign for non-native teachers by emphasizing certain dubious features of spoken grammar (e.g. "like" for reported speech) that we don't really need to teach learners. This is particularly ironic because Innovations and, to a lesser degree, Outcomes (coursebooks co-authored by Hugh Dellar) are packed with colloquialisms. Innovations Upper-Intermediate has a whole page devoted to vague language (sort of, kind of, -ish), an important feature of spoken English grammar. Perhaps I wouldn't teach "like" for productive use in an EFL context. But what about the determiner "this", which has a markedly different use in spoken language, as corpus studies have revealed? In spoken narratives, in contrast to written language, we often use "this" to refer to things NOT previously mentioned in order to make them more vivid.




I saw this weird guy on the train yesterday.

And then there was this loud pop, like something exploded.






Corpora have provided us with more accurate language descriptions and informed dictionaries, grammar reference books and pedagogical materials. With various corpora and "corpus-light" tools (see here) now widely available online, corpora are no longer the preserve of linguists but a valuable resource for teachers and learners.



For another rebuttal of Hugh Dellar's argument, see Mura Nava's post here.
