Corpus tasks at discourse level


Task 1

· Presented below are two top-50 word lists. First, categorize the words in each list as function words or lexical words. Then note the similarities and differences between the two lists, paying special attention to the use of NPs, VPs, tenses and modal verbs. Finally, generalize from the patterns you have observed.


Learner top 50 list               Expert top 50 list

No  Word         Token     %      No  Word       Token     %
1   THE         12,892  6.29      1   THE       12,781  5.42
2   OF           6,567  3.21      2   OF         8,168  3.46
3   TO           6,166  3.01      3   AND        6,699  2.84
4   AND          5,381  2.63      4   IN         6,382  2.71
5   IN           5,224  2.55      5   TO         6,093  2.58
6   A            3,096  1.51      6   A          4,521  1.92
7   STUDENTS     2,937  1.43      7   THAT       3,516  1.49
8   IS           2,513  1.23      8   FOR        2,349  1.00
9   THAT         2,426  1.18      9   AS         2,315  0.98
10  FOR          1,871  0.91      10  IS         2,195  0.93
11  VOCABULARY   1,711  0.84      11  WITH       1,686  0.72
12  AS           1,644  0.80      12  THIS       1,626  0.69
13  ENGLISH      1,555  0.76      13  ON         1,623  0.69
14  IT           1,497  0.73      14  ENGLISH    1,368  0.58
15  ARE          1,488  0.73      15  LANGUAGE   1,332  0.56
16  WORDS        1,464  0.71      16  ARE        1,261  0.53
17  LEARNING     1,439  0.70      17  OR         1,238  0.53
18  THEY         1,420  0.69      18  WAS        1,175  0.50
19  THEIR        1,346  0.66      19  THEIR      1,165  0.49
20  ON           1,288  0.63      20  BE         1,159  0.49
21  WITH         1,284  0.63      21  NOT        1,126  0.48
22  I            1,212  0.59      22  WERE       1,087  0.46
23  BE           1,200  0.59      23  BY         1,048  0.44
24  THIS         1,142  0.56      24  IT         1,031  0.44
25  WORD         1,113  0.54      25  THEY         965  0.41
26  BY             988  0.48      26  AN           954  0.40
27  NOT            980  0.48      27  FROM         953  0.40
28  CAN            917  0.45      28  STUDENTS     950  0.40
29  WAS            914  0.45      29  HAVE         892  0.38
30  LANGUAGE       833  0.41      30  WHICH        790  0.34
31  MORE           828  0.40      31  I            740  0.31
32  WERE           809  0.40      32  THESE        737  0.31
33  FROM           779  0.38      33  MORE         717  0.30
34  TEACHERS       774  0.38      34  AT           668  0.28
35  OR             764  0.37      35  ONE          668  0.28
36  HAVE           742  0.36      36  HER          647  0.27
37  WHICH          741  0.36      37  WORDS        638  0.27
38  READING        726  0.35      38  WE           606  0.26
39  TEACHING       683  0.33      39  STUDY        587  0.25
40  USE            672  0.33      40  LEARNERS     582  0.25
41  THEM           670  0.33      41  LEARNING     567  0.24
42  LEARNERS       669  0.33      42  SHE          546  0.23
43  ALSO           601  0.29      43  TEACHERS     532  0.23
44  STRATEGIES     553  0.27      44  BUT          527  0.22
45  ONE            545  0.27      45  MAY          527  0.22
46  SOME           544  0.27      46  ALSO         523  0.22
47  AN             519  0.25      47  OTHER        513  0.22
48  NEW            508  0.25      48  THAN         506  0.21
49  QUESTIONS      499  0.24      49  BETWEEN      497  0.21
50  WHEN           491  0.24      50  SUCH         489  0.21
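
For readers who want to check their manual categorization, the first step of the task could be sketched in Python roughly as follows. This is only a minimal illustration: the sample covers ranks 7 to 16 of the learner list, and FUNCTION_WORDS is a small hand-made set assumed for the example, not an authoritative inventory. Borderline items such as HAVE or MORE still call for a human decision.

```python
# Hypothetical, hand-made set of closed-class words (an assumption for
# illustration only; extend it before applying it to the full lists).
FUNCTION_WORDS = {
    "THE", "OF", "TO", "AND", "IN", "A",
    "IS", "THAT", "FOR", "AS", "IT", "ARE",
}

# (word, % of corpus) pairs copied from ranks 7-16 of the learner list.
LEARNER_SAMPLE = [
    ("STUDENTS", 1.43), ("IS", 1.23), ("THAT", 1.18), ("FOR", 0.91),
    ("VOCABULARY", 0.84), ("AS", 0.80), ("ENGLISH", 0.76), ("IT", 0.73),
    ("ARE", 0.73), ("WORDS", 0.71),
]


def categorize(rows):
    """Split (word, percent) rows into function and lexical words, keeping rank order."""
    groups = {"function": [], "lexical": []}
    for word, percent in rows:
        key = "function" if word.upper() in FUNCTION_WORDS else "lexical"
        groups[key].append((word, percent))
    return groups


def share(rows):
    """Summed percentage of the corpus covered by the given rows."""
    return round(sum(percent for _, percent in rows), 2)


groups = categorize(LEARNER_SAMPLE)
for label, rows in groups.items():
    print(label, [word for word, _ in rows], share(rows))
```

Running the same two functions on the corresponding expert rows gives a quick side-by-side view of how much of each corpus is taken up by lexical words in that rank range.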