@@ -57,7 +57,7 @@ The concreteness values of words are somehow used as a filtering step in this mu
A json file that contains a dictionary of words and their concreteness values is required for the multimodal dataset creation. In the repository, there is such a file "data/concreteness/concretenessValuesMscoco.json" containing the concreteness values of the "MS COCO train2014 dataset".
-That concreteness file can be created with the[implementation](https://github.com/victorssilva/concreteness) of [Visual Concreteness](https://arxiv.org/abs/1804.06786).
+That concreteness file can be created with this [implementation](https://github.com/victorssilva/concreteness) of [Visual Concreteness](https://arxiv.org/abs/1804.06786).
### Setup Database
...
...
@@ -164,7 +164,7 @@ If we want to define a word from the concreteness values file to be concrete/dep
python main.py --concreteness_threshold 50
```
-The image retrieval with CLIP can be influenced with the parameters "--candidate_imgs", "--sent_img_similarity" and "--focus_word_img_similarity". The choice of the first two parameters is based on [paper](https://www.inf.uni-hamburg.de/en/inst/ab/lt/publications/2022-wangetal-lrec.pdf). The last parameter then bases on the second one. Especially, increasing the last two ones might result in more suitable images but less multimodal sentences
+The image retrieval with CLIP can be influenced via the parameters "--candidate_imgs", "--sent_img_similarity" and "--focus_word_img_similarity". The choice of the first two parameters is based on this [paper](https://www.inf.uni-hamburg.de/en/inst/ab/lt/publications/2022-wangetal-lrec.pdf). The third parameter builds on the second. In particular, increasing the last two may yield more suitable images but fewer multimodal sentences.
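The concreteness thresholding described above (cf. `--concreteness_threshold 50`) can be sketched roughly as follows. This is a hypothetical helper, not the repository's actual implementation; it only assumes that the JSON file maps words to numeric concreteness values, as the README describes.

```python
import json

def filter_concrete_words(concreteness_path, threshold):
    # Hypothetical sketch: load the word -> concreteness mapping
    # (e.g. data/concreteness/concretenessValuesMscoco.json) and keep
    # only words whose concreteness value meets the threshold.
    with open(concreteness_path) as f:
        concreteness = json.load(f)
    return {word: value for word, value in concreteness.items()
            if value >= threshold}
```

Words surviving this filter would then be the "concrete" candidates considered for image retrieval.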