  Akiba, Y., Imamura, K., and Sumita, E. 2001. Using multiple edit distances to automatically rank machine translation output. In Proceedings of the MT Summit VIII, 15-20, Santiago de Compostela, Spain.
  Bangalore, S., Rambow, O., and Whittaker, S. 2000. Evaluation metrics for generation. In Proceedings of the 1st International Conference on Natural Language Generation, 1-8, Mitzpe Ramon, Israel.
  Belz, A. and Kilgarriff, A. 2006. Shared-task evaluations in HLT: Lessons for NLG. In Proceedings of the 4th International Conference on Natural Language Generation, 133-135, Sydney, Australia.
  Belz, A. and Reiter, E. 2006. Comparing automatic and human evaluation of NLG systems. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics, 313-320, Trento, Italy.
  Colineau, N., Paris, C., and Vander Linden, K. 2002. An evaluation of procedural instructional text. In Proceedings of the 2nd International Conference on Natural Language Generation.
  Dale, R. and Mellish, C. 1998. Towards the evaluation of natural language generation. In Proceedings of the 1st International Conference on Evaluation of Natural Language Processing Systems, Granada, Spain.
  Gatt, A. 2006. Structuring knowledge for reference generation: A clustering algorithm. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics, Trento, Italy.
  Gupta, S. and Stent, A. 2005. Automatic evaluation of referring expression generation using corpora. In Proceedings of the Workshop on Using Corpora for Natural Language Generation, 1-6, Brighton, UK.
  Jones, K. S. and Galliers, J. 1996. Evaluating Natural Language Processing Systems. Springer Verlag, Berlin/Heidelberg.
  Levine, J. and Mellish, C. 1995. The IDAS user trials: Quantitative evaluation of an applied natural language generation system. In Proceedings of the 5th European Workshop on Natural Language Generation, 75-94, Leiden, The Netherlands.
  McKeown, K. R. 2006. Lessons learned from large scale evaluation of systems that produce text: Nightmares and pleasant surprises. In Proceedings of the 4th International Conference on Natural Language Generation, 3-5, Sydney, Australia.
  Mellish, C. and Dale, R. 1998. Evaluation in the context of natural language generation. Computer Speech and Language, 12(4):349-373.
  Meteer, M. and McDonald, D. 1991. Evaluation for generation. In Neal, J. G. and Walter, S. M. (Eds.), Natural Language Processing Systems Evaluation Workshop: Technical Report RL-TR-91-362, 127-131. Rome Laboratory, Griffiss Air Force Base, NY.
  Nenkova, A. and Passonneau, R. 2004. Evaluating content selection in summarization: The pyramid method. In Main Proceedings of HLT-NAACL 2004, 145-152, Boston, Massachusetts, USA.
  Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. 2001. Bleu: a method for automatic evaluation of machine translation. Technical report, IBM Thomas J. Watson Research Center, Yorktown Heights, NY.
  Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 311-318, Philadelphia, PA.
  Paris, C., Colineau, N., and Wilkinson, R. 2006. Evaluations of NLG systems: Common corpus and tasks or common dimensions and metrics? In Proceedings of the 4th International Conference on Natural Language Generation, 127-129.
  Rambow, O., Rogati, M., and Walker, M. A. 2001. Evaluating a trainable sentence planner for a spoken dialogue system.
  Reiter, E. and Belz, A. 2006. Geneval: A proposal for shared-task evaluation in NLG. In Proceedings of the 4th International Conference on Natural Language Generation, 136-138, Sydney, Australia.
  Reiter, E. and Dale, R. 2000. Building Natural Language Generation Systems. Cambridge University Press.
  Reiter, E. and Sripada, S. 2002. Should corpora texts be gold standards for NLG? In Proceedings of the 2nd International Conference on Natural Language Generation, 97-104.
  Reiter, E., Robertson, R., Lennox, A. S., and Osman, L. 2001. Using a randomised controlled clinical trial to evaluate an NLG system.
  Scott, D. and Moore, J. 2006. An NLG evaluation competition? Eight reasons to be cautious. Technical Report 2006/09, Department of Computing, The Open University, UK.
  van Deemter, K., van der Sluis, I., and Gatt, A. 2006. Building a semantically transparent corpus for the generation of referring expressions. In Proceedings of the 4th International Conference on Natural Language Generation, 130-132, Sydney, Australia.
  van der Sluis, I. and Krahmer, E. 2004. Evaluating multimodal NLG using production experiments. In Proceedings of the 4th International Conference on Language Resources and Evaluation, Lisbon, Portugal.
  Viethen, J. and Dale, R. 2006a. Algorithms for generating referring expressions: Do they do what people do? In Proceedings of the 4th International Conference on Natural Language Generation, 63-70, Sydney, Australia.
  Viethen, J. and Dale, R. 2006b. Towards the evaluation of referring expression generation. In Proceedings of the 4th Australasian Language Technology Workshop, 115-122, Sydney, Australia.


Last updated: 5 Jan 2007