12. LSP – workshop

Weather – Compiling and analysing your own ESP corpus

I. Create your own corpus of weather forecasts or weather news.

Activity 1

    1. Find relevant Internet sites. You may use the following URLs:

  1. Identify relevant texts.
  2. Copy each text that you judge suitable for your corpus and paste it into Word or any other word processor. You can store each text in a separate file or combine all the texts in one file.
  3. When building your own corpus, you are advised to keep track of your sources by noting the source of each text in a separate table and cross-referencing it to the text (see the tables for my urology corpora). For today’s activity you do not need to do this.
  4. Try to build a corpus of at least 5 thousand tokens. For your course project you will need a corpus six times larger (30 thousand tokens).
  5. After you have finished copying and pasting your texts, save your file or files. Make sure you save them as plain text (the .txt format).

Congratulations. You have just created your own corpus. The next step is to analyze it.
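
Before you move on, you may want to check whether your corpus has reached the 5 thousand token target. The sketch below is one way to do this in Python. The token pattern (runs of letters, with an optional internal apostrophe) and the function names are assumptions made for this illustration, so the total may differ slightly from the figure your concordancer reports.

```python
import re
from pathlib import Path

# Assumed token definition: a run of letters, optionally with one
# internal apostrophe (e.g. "tomorrow's"). Concordancers may count differently.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")

def count_tokens(text):
    """Count the word tokens in a string."""
    return len(TOKEN_RE.findall(text))

def corpus_size(folder):
    """Add up the token counts of every .txt file in a folder."""
    return sum(
        count_tokens(path.read_text(encoding="utf-8", errors="replace"))
        for path in sorted(Path(folder).glob("*.txt"))
    )

sample = "Heavy rain is expected tonight. Tomorrow's forecast: sunny spells."
print(count_tokens(sample))  # prints 9
```

To check a whole corpus, call corpus_size with the folder that holds your txt files and compare the result with 5000.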

II. Analysing your corpus

 Activity 2

  1. Open your concordancer and choose the file or files you will be working with. If you are working with AntConc, go to File, Open File(s).
  2. One of the most basic corpus-based analyses is studying a word list. Create a word list for your corpus. In AntConc go to the Word List tab. In the Display options tick Treat all data as lowercase and press the Start button at the bottom of the window. The list should appear on the screen. It is arranged by frequency. You can sort it alphabetically by choosing Sort by word in the Sort by pull-down menu and pressing the Sort button.
  3. Study the words in the list. Which items do you find relevant to the topic? When in doubt, move your cursor to the questionable item and press Enter. You will see the concordances of that word on the screen. Study them in order to decide whether you need to include this word in your teaching list. To go back to your word list, click the Word List tab.
  4. Copy the word list from the concordancer to Word or another word processor. Delete the items you have found irrelevant and keep those which should make up your teaching list. Save the file.

Congratulations! You have just created a list of items to be taught in your ESP course.
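
If you would like to see what the concordancer does behind the scenes in Activity 2, the steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the tokenisation rule, the function names and the sample text are invented here and are not taken from AntConc.

```python
import re
from collections import Counter

# Assumed token definition: lowercase letters with an optional internal apostrophe.
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")

def word_list(text):
    """Return (word, frequency) pairs, most frequent first; ties are alphabetical."""
    counts = Counter(TOKEN_RE.findall(text.lower()))  # treat all data as lowercase
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def concordance(text, word, width=30):
    """Return one keyword-in-context line for each occurrence of `word`."""
    text = " ".join(text.split())  # collapse line breaks so each hit stays on one line
    lines = []
    for match in re.finditer(rf"\b{re.escape(word)}\b", text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines

# Invented sample text standing in for your corpus.
sample = "Rain will spread east. Heavy rain is likely. Showers follow the rain."
by_frequency = word_list(sample)     # sorted by frequency, as in the Word List tab
alphabetical = sorted(by_frequency)  # the same list sorted by word
print(by_frequency[:3])
for line in concordance(sample, "rain"):
    print(line)
```

The concordance lines play the same role as pressing Enter on a questionable item: they let you read the word in context before deciding whether it belongs in your teaching list.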

Activity 3

  1. Another, more automatic way of extracting terminology from a corpus is to use the keyword function. This function uses statistical tests to find the words whose frequency is unusually high in comparison with a reference corpus. We assume that these items make up the terminology of a text. For this analysis you will need a reference (General English) corpus. Alternatively, you can use a word list generated from a General English corpus. For this exercise, I have made such a list available for you. It has been retrieved from a corpus called FlOB_B. Follow the link below and save the list on your computer.
  2. If you are using AntConc, go to Tool preferences, click on Keyword List, click the button Use word list(s) and then Choose Files. Choose the reference list you have saved on your computer and press the Apply button.
  3. Now go to the Keyword List tab and press the Start button at the bottom of the screen. The list of keywords appears on the screen. Notice that it is much shorter than the word list we studied earlier.
  4. Analyze the keyword list in the same way as you analyzed the word list earlier.
  5. Copy the keyword list from the concordancer to Word or another word processor. Delete the items you have found irrelevant and keep those which should make up your teaching list. Save the file.
  6. Which of the two analyses produced more relevant results?
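
The statistical test behind a keyword list can be sketched in a few lines of Python. The example assumes the log-likelihood measure, which is widely used for keyness; your concordancer may be set to a different test, so treat this as an illustration and not as a description of AntConc itself. The function names are hypothetical and the frequencies are invented for the example.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Keyness of a word found a times in c study tokens and b times in d reference tokens."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def keywords(study, reference, threshold=3.84):
    """Return (word, score) pairs for words that are unusually frequent in the study corpus."""
    c, d = sum(study.values()), sum(reference.values())
    result = []
    for word, a in study.items():
        b = reference.get(word, 0)
        if a / c > b / d:  # keep only words that are relatively more frequent in the study corpus
            score = log_likelihood(a, b, c, d)
            if score >= threshold:  # 3.84 corresponds to p < 0.05
                result.append((word, score))
    return sorted(result, key=lambda pair: -pair[1])

# Invented frequencies: a small weather word list and a General English word list.
study = Counter({"rain": 40, "the": 60, "showers": 20, "of": 30})
reference = Counter({"rain": 5, "the": 6000, "showers": 1, "of": 3000, "government": 200})
for word, score in keywords(study, reference):
    print(f"{word}\t{score:.1f}")
```

With these invented figures only rain and showers are returned. The and of are frequent in the weather list too, but they are even more frequent in the reference list, which is why a keyword list comes out so much shorter than a plain word list.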

The next step in your analyses can be extracting some chunks and collocations for teaching in your ESP course/class.

Activity 4

  1. Choose a word for which you want to extract chunks and collocations. It should be a fairly frequent word.
  2. Go to the Clusters tab. Type your word in the box at the bottom of the screen (e.g. rain). Set the Cluster size to a minimum of 2 and a maximum of 4 words and press the Start button. A list of clusters appears on the screen.
  3. As you can probably see, your list contains a lot of “noise” (that is, irrelevant clusters). Eliminate them following the same procedure as for the word list and keyword list analyses. Save the meaningful clusters in a file.
  4. Now go to the Collocates tab. Type in your word and change the Window span setting to 4 words on the left (4L) and 4 words on the right (4R). Now press the Start button. The list of collocates appears on the screen.
  5. Study the collocates by clicking on them and studying them in the concordance lines. Note that you get all the occurrences of the collocate word, not only those in collocation with your target word.
  6. Identify interesting collocations and example sentences and note them down in the file containing clusters.
  7. Follow the same procedure for two or three items.
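
The two searches in Activity 4 can also be sketched in Python. This is a minimal illustration under stated assumptions: the function names and the sample text are invented, the cluster search accepts the target word in any position (your concordancer may restrict it to the left or right edge), and the collocate window ignores sentence boundaries.

```python
import re
from collections import Counter

def tokenize(text):
    """Split a text into lowercase word tokens (assumed token definition)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def clusters(tokens, word, min_size=2, max_size=4):
    """Count every sequence of min_size to max_size tokens that contains `word`."""
    counts = Counter()
    for size in range(min_size, max_size + 1):
        for start in range(len(tokens) - size + 1):
            gram = tokens[start:start + size]
            if word in gram:
                counts[" ".join(gram)] += 1
    return counts

def collocates(tokens, word, left=4, right=4):
    """Count the tokens found within `left` and `right` positions of `word`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == word:
            # The window does not stop at sentence boundaries.
            window = tokens[max(0, i - left):i] + tokens[i + 1:i + 1 + right]
            counts.update(window)
    return counts

# Invented sample text standing in for your corpus.
tokens = tokenize("Heavy rain is expected. Heavy rain will ease. Light rain is possible.")
print(clusters(tokens, "rain").most_common(5))
print(collocates(tokens, "rain").most_common(5))
```

As in the concordancer, both lists contain noise, so you still need to read through them and keep only the clusters and collocations worth teaching.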

By now you have analyzed your corpus in enough detail to prepare a short ESP course/class. Put the results of your analysis in one document with appropriate headings and turn the document in as your assignment.