#1199: convert GD field notes to html and add to fieldnotes.makingandknowing - docx/gdoc

opened by njr2128

All field notes from Fall 2018 are housed in GD, and were not included in the wikispace archive subdomain under the Project's website.

These need to be converted from Google Doc / docx to HTML and added to the subdomain so that they can be linked to from annotations.


njr2128 commented:

Downloaded the entire directory "Field Notes Fall 2018" within "Student Files" and extracted the zipped file.

Sanitized file names: `detox -r Field\ Notes\ Fall\ 2018`

In NJR's command line: `njr2128@LAPTOP-0SMCQ75L:/mnt/c/users/naomi/Downloads$ detox -r Field\ Notes\ Fall\ 2018`


njr2128 commented:

Recursively find all docx files and grab their basenames:

``for DOCX in `find Field_Notes_Fall_2018 -name '*.docx'`;do BASENAME=`basename -s .docx $DOCX`;echo $BASENAME;done``

In NJR's command line: ``njr2128@LAPTOP-0SMCQ75L:/mnt/c/users/naomi/Downloads$ for DOCX in `find Field_Notes_Fall_2018 -name '*.docx'`;do BASENAME=`basename -s .docx $DOCX`;echo $BASENAME;done``
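The loop above relies on the file names already being sanitized, since an unquoted `$DOCX` would split on spaces. As a minimal sketch of the same listing step that does not depend on that, the function below (the name `list_docx_basenames` is hypothetical, not from the thread) lets `find` pass each path to `basename` as a single argument:

```shell
# Print the basename (without the .docx suffix) of every .docx file under a directory.
# find hands each path to basename as one argument, so names containing spaces
# are not split. The function name is illustrative only.
list_docx_basenames() {
  find "$1" -type f -name '*.docx' -exec basename -s .docx {} \;
}

# Example call, using the directory name from this thread:
# list_docx_basenames Field_Notes_Fall_2018
```

This should print one basename per line, just as the original loop does for sanitized names.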


njr2128 commented:

For every .docx file, create a directory in /tmp named after the file's basename, and extract all linked media into a media directory inside that same result directory:

``for DOCX in `find Field_Notes_Fall_2018 -name '*.docx'`;do BASENAME=`basename -s .docx $DOCX`;pandoc "$DOCX" --extract-media=/tmp/"$BASENAME" -f docx -t html -s -o /tmp/"$BASENAME"/"$BASENAME".html;done``

In NJR's command line: ``njr2128@LAPTOP-0SMCQ75L:/mnt/c/users/naomi/Downloads$ for DOCX in `find Field_Notes_Fall_2018 -name '*.docx'`;do BASENAME=`basename -s .docx $DOCX`;pandoc "$DOCX" --extract-media=/tmp/"$BASENAME" -f docx -t html -s -o /tmp/"$BASENAME"/"$BASENAME".html;done``
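The conversion step could also be wrapped in a reusable function. The sketch below uses the same pandoc options as the command above; the function name `convert_docx_tree` and its two arguments are assumptions made for illustration. It creates each result directory up front with `mkdir -p` instead of relying on pandoc to create it (for example, when a document has no embedded media), and it quotes every path.

```shell
# Convert every .docx under src_root to standalone HTML, one result directory
# per file: <out_root>/<basename>/<basename>.html, with extracted media placed
# under <out_root>/<basename>/media. Function and argument names are hypothetical.
convert_docx_tree() {
  src_root=$1
  out_root=$2
  find "$src_root" -type f -name '*.docx' -exec sh -c '
    out_root=$1
    shift
    for docx in "$@"; do
      name=$(basename -s .docx "$docx")
      # Make sure the result directory exists before pandoc writes into it.
      mkdir -p "$out_root/$name"
      pandoc "$docx" -f docx -t html -s \
        --extract-media="$out_root/$name" \
        -o "$out_root/$name/$name.html"
    done
  ' sh "$out_root" {} +
}

# Example call, mirroring the paths used in this thread:
# convert_docx_tree Field_Notes_Fall_2018 /tmp
```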


njr2128 commented:

Need to rewrite the URLs for linked media to make them relative.

Use oXygen. For now, open any project and use the find/replace function, but point it at a specific directory.

step 1: go to "file" and then "Find/Replace in Files":

image

step 2: change specified path to the directory where html files are:

image

step 3: choose specified path/directory (in this case, NJR's downloads folder, under a folder "tmp"):

image
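As a possible command-line alternative to the oXygen find/replace steps, the rewrite could be sketched with `sed`. This assumes pandoc wrote the media links as absolute paths of the form `<out_root>/<basename>/media/...` (for example `/tmp/<basename>/media/image1.png`), which should be checked against the generated HTML before running it. The function name `make_media_links_relative` is hypothetical, and the directory names are assumed to be detox-sanitized so they contain no `|` characters.

```shell
# Rewrite absolute media paths in each generated HTML file to relative ones.
# Assumes the layout <out_root>/<name>/<name>.html with media links written as
# "<out_root>/<name>/media/...". Pass out_root without a trailing slash.
make_media_links_relative() {
  out_root=$1
  for dir in "$out_root"/*/; do
    name=$(basename "$dir")
    html="$dir$name.html"
    # Skip directories that do not hold a matching HTML file.
    [ -f "$html" ] || continue
    # Write to a temporary file and move it into place, which avoids
    # depending on the non-portable "sed -i" option.
    sed "s|$out_root/$name/media/|media/|g" "$html" > "$html.tmp" &&
      mv "$html.tmp" "$html"
  done
}

# Example call, matching the output location used in this thread:
# make_media_links_relative /tmp
```

Because the HTML file sits next to its `media` directory, a link such as `media/image1.png` should resolve wherever the result directory is moved.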