Refined Commonsense Knowledge from Large-Scale Web Contents

Run the pipeline stages in the following order:

nlp_pipeline.pipeline
open_ie.open_ie
triple_filtering.filter
triple_grouping.group_per_c4_part, triple_grouping.group_all, triple_grouping.get_frequent_triples
triple_clustering.precompute_embeddings, triple_clustering.clustering
conceptnet_mapping.inference
ranking
final_filtering.final_filtering
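
The order above can be sketched as a small driver that builds one command per stage. This is only an illustration: it assumes each stage can be launched with `python -m <module>` from the repository root, which the list does not confirm, and `build_commands` is a hypothetical helper that is not part of the repository. Check each module for its actual entry point and arguments before running anything.

```python
import sys

# Stage names copied from the list above, in execution order.
# "ranking" is listed without a submodule, so it is kept as-is.
STAGES = [
    "nlp_pipeline.pipeline",
    "open_ie.open_ie",
    "triple_filtering.filter",
    "triple_grouping.group_per_c4_part",
    "triple_grouping.group_all",
    "triple_grouping.get_frequent_triples",
    "triple_clustering.precompute_embeddings",
    "triple_clustering.clustering",
    "conceptnet_mapping.inference",
    "ranking",
    "final_filtering.final_filtering",
]


def build_commands(stages=STAGES, python=sys.executable):
    """Return one argv list per stage (hypothetical helper).

    Assumes every stage is runnable as `python -m <module>`; the real
    scripts may need extra arguments or a different invocation.
    """
    return [[python, "-m", stage] for stage in stages]


if __name__ == "__main__":
    # Print the commands instead of executing them, so the order can be
    # reviewed before anything is run.
    for cmd in build_commands():
        print(" ".join(cmd))
```

Printing the commands first keeps the sketch side-effect free; to execute them, each argv list could be passed to `subprocess.run(cmd, check=True)` so that a failing stage stops the sequence.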

Global configurations can be found in app_config.py.

The pipeline requires the following files:

Precomputed similarity scores between C4 documents and Wikipedia articles: https://nextcloud.mpi-inf.mpg.de/index.php/s/nJSSW5QBQR3XoxH (cf. triple_filtering/filter.py)
Subjects: https://nextcloud.mpi-inf.mpg.de/index.php/s/TiSm3rrJ9kEqfm8 (cf. triple_filtering/filter.py)
ConceptNet mapping train/dev files: https://nextcloud.mpi-inf.mpg.de/index.php/s/JeLRgsiNykcnRbs (cf. conceptnet_mapping/finetune.py)
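
Before starting a long run, it can help to verify that the downloaded files are where the scripts expect them. The sketch below shows one way to do that. The paths in `REQUIRED` are placeholders chosen for illustration, not the repository's actual locations (those are set in the scripts referenced above), and `missing_resources` is a hypothetical helper.

```python
from pathlib import Path

# Placeholder locations, NOT taken from the repository: the real paths are
# defined in triple_filtering/filter.py and conceptnet_mapping/finetune.py.
REQUIRED = {
    "C4-Wikipedia similarity scores": Path("data/similarity_scores"),
    "subjects": Path("data/subjects"),
    "ConceptNet mapping train/dev files": Path("data/conceptnet_mapping"),
}


def missing_resources(required=REQUIRED):
    """Return the labels of resources whose path does not exist."""
    return [label for label, path in required.items() if not path.exists()]


if __name__ == "__main__":
    missing = missing_resources()
    if missing:
        print("Missing resources: " + ", ".join(missing))
    else:
        print("All required resources found.")
```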

If you use Ascent++, please cite the following paper:

@ARTICLE{ascentpp,
  author={Nguyen, Tuan-Phong and Razniewski, Simon and Romero, Julien and Weikum, Gerhard},
  journal={IEEE Transactions on Knowledge and Data Engineering}, 
  title={Refined Commonsense Knowledge from Large-Scale Web Contents}, 
  year={2022},
  doi={10.1109/TKDE.2022.3206505}
}

Repository: phongnt570/large-scale-csk-extraction
