glossary

Scraping


Extracting data from a non-machine-readable source, such as a website or a PDF document, and creating structured data from the result. Screen-scraping a dataset requires dedicated programming and is expensive in programmer time, so it is generally done only after all other attempts to get the data in structured form have failed. Scraping may also raise legal questions about whether it breaches the source website’s copyright or terms of service.
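
As a rough illustration of what turning a web page into structured data can involve, the sketch below reads an HTML table and returns one record per row. It uses only Python's standard-library `html.parser`. The names `TableScraper`, `scrape_table` and `SAMPLE_HTML` are hypothetical, the sample table is invented for the example, and a real scraper would also need to fetch the page and cope with messier markup.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (for example whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table rows as dictionaries keyed by the first (header) row."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Invented sample page; the values are placeholders, not real data.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Alpha</td><td>100</td></tr>
  <tr><td>Beta</td><td>250</td></tr>
</table>
"""

records = scrape_table(SAMPLE_HTML)
for record in records:
    print(record)
```

Even this small case needs code written around the page's particular layout, which is part of why scraping is costly compared with obtaining the data in structured form from the publisher.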