@yuvipanda
Collaborator

@yuvipanda yuvipanda commented May 22, 2021

Looks like OpenRefine is used in wikidata a bit
https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine,
so should be useful here

You can test this with:

$ docker build -t paws images/singleuser
$ docker run -p 8888:8888 paws jupyter notebook --ip=0.0.0.0

This should start a notebook server, and print a URL with a token
you can use to connect to the docker container & test the image.
Since this uses port forwarding, it'll most likely only work if you
are running docker locally (and not via docker-machine)

You can open OpenRefine by going to 'New -> OpenRefine'

Depends on #64, which depends on #62

yuvipanda added 4 commits May 22, 2021 19:53
Uses github.com/jupyterhub/jupyter-rsession-proxy/ to provide
RStudio :)

chicocvenancio previously approved these changes May 22, 2021
@chicocvenancio chicocvenancio merged commit 38872f3 into toolforge:master May 22, 2021
@chicocvenancio
Member

This is live.
Two things we might want to look at are RAM usage and the initial time to bring up OpenRefine. It seems we get a 503 if it takes a while to start. And in the logs it complains about only having 1.4 GiB of RAM.

Using refine.ini for configuration
You have 16041M of free memory.
Your current configuration is set to use 1400M of memory.
OpenRefine can run better when given more memory. Read our FAQ on how to allocate more memory here:
https://github.com/OpenRefine/OpenRefine/wiki/FAQ:-Allocate-More-Memory
/usr/bin/java -cp server/classes:server/target/lib/* -Xms1400M -Xmx1400M -Drefine.memory=1400M -Drefine.max_form_content_size=1048576 -Drefine.verbosity=info -Dpython.path=main/webapp/WEB-INF/lib/jython -Dpython.cachedir=/home/paws/.local/share/google/refine/cachedir -Drefine.data_dir=/home/paws -Drefine.webapp=main/webapp -Drefine.port=3333 -Drefine.host=127.0.0.1 com.google.refine.Refine
Starting OpenRefine at 'http://127.0.0.1:3333/'

18:52:22.971 [            refine_server] Starting Server bound to '127.0.0.1:3333' (0ms)
18:52:23.024 [            refine_server] refine.memory size: 1400M JVM Max heap: 1419116544 (53ms)
18:52:23.040 [            refine_server] Initializing context: '/' from '/srv/openrefine/webapp' (16ms)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/srv/openrefine/server/target/lib/slf4j-log4j12-1.7.18.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/srv/openrefine/webapp/WEB-INF/lib/slf4j-log4j12-1.7.18.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[E 2021-05-22 18:52:25.140 SingleUserNotebookApp log:174] 503 GET /user/chicocvenancio/openrefine/ (chicocvenancio@192.168.33.128) 3005.34ms
18:52:25.666 [                   refine] Starting OpenRefine 3.4.1 [437dc4d]... (2626ms)
18:52:25.666 [                   refine] initializing FileProjectManager with dir (0ms)
18:52:25.667 [                   refine] /home/paws (1ms)
18:52:25.728 [       FileProjectManager] Failed to load workspace from any attempted alternatives. (61ms)
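
For poking at the slow start locally, here is a rough sketch of a retry helper. It is not part of this PR: `wait_until` is a hypothetical name, and the address in the usage comment is simply the one OpenRefine reports binding to in the log above. It retries a command until it succeeds or the attempts run out, which gives a crude measure of how long OpenRefine takes before it answers requests.

```shell
#!/bin/sh
# wait_until ATTEMPTS DELAY COMMAND...
# Hypothetical helper: retry COMMAND until it succeeds, up to ATTEMPTS
# times, sleeping DELAY seconds after each failed try.
# Returns 0 on the first success, 1 if every attempt failed.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (not executed here): poll the address from the log for up to
# roughly 30 seconds. The URL is an assumption taken from the log output.
# wait_until 30 1 curl -fsS -o /dev/null http://127.0.0.1:3333/
```

If the poll only succeeds after several seconds, that would be consistent with the 503 being returned while the JVM is still starting, though this sketch does not confirm where the 503 comes from.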

@crookedstorm
Collaborator

PAWS pods only get 3GB of RAM right now in their limit-ranges. There have been asks to increase that, but we may need to resize all the workers and quota for the project as well. It seems like you could feed refine more memory, but not much more.
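
To make the trade-off concrete, a minimal sketch of picking a heap size from the pod limit follows. It assumes the image launches OpenRefine through its `refine` script, which accepts `-m` for the maximum heap; the function name `refine_heap_mb` and the 1 GiB headroom for the notebook server and JVM overhead are illustrative guesses, not measured values or anything this PR sets.

```shell
#!/bin/sh
# refine_heap_mb LIMIT_MB HEADROOM_MB
# Hypothetical helper: print a heap size in MiB, computed as the pod memory
# limit minus the headroom left for everything else running in the pod.
refine_heap_mb() {
    limit_mb=$1
    headroom_mb=$2
    echo $((limit_mb - headroom_mb))
}

# Example: the 3 GiB pod limit with an assumed 1 GiB of headroom.
heap=$(refine_heap_mb 3072 1024)

# Only print the command; the interface and port mirror the log above.
echo "would run: refine -i 127.0.0.1 -p 3333 -m ${heap}M"
```

With these numbers the heap comes out at 2048M, which is more than the current 1400M but, as noted, not much more.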
