Leveraging NLP and web knowledge graphs to harmonize locations: A case study on US patent transactions

Ascione G. S. (first author)
2024-01-01

Abstract

In the present study, we introduce a novel methodology for the harmonization and standardization of locations associated with patent transactions recorded at the USPTO from 2005 to 2022. Using natural language processing (NLP) techniques in conjunction with search engine-based web knowledge graphs, our method comprises four phases: data pre-processing, semantic clustering, exploitation of web knowledge graphs, and API-driven harmonization. Starting from a dataset of 63,838 unique locations, our methodology reduces this number by more than 50%, with an accuracy rate of approximately 92%. The resulting geolocated dataset of companies' patent transactions offers a valuable resource for fine-grained geographical analyses of the markets for technology; in particular, we provide examples of relevant economic insights that can be drawn from the geographical patterns of those transactions.
2024
Patent transactions; Patent assignee; Patent data harmonization; Natural language processing; Knowledge graph
Ascione, G. S.; Vezzulli, A.
Files in this item:
File Size Format
Ascione_Vezzulli_2024.pdf

not available

Type: Publisher's version (PDF)
License: Publisher's copyright
Size: 2.43 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11383/2185591