I'm getting back into Cascalog after a long pause and am finding that
project.clj and the surrounding ecosystem (version compatibility,
dependencies vs. dev-dependencies) have mostly changed with lein2. Thanks
for putting "cascalog-hello" together (https://github.com/ctdean/cascalog-hello).
It helps a lot.
That said, I'm seeing a few errors when I run locally at a lein2 repl,
along with an INFO message warning that class-not-found exceptions may
occur on the cluster. How can I make sure it finds the classes it requires
when running locally, and ensure it does the right thing on EC2?
Example:
(??- (freq-count-query [["a line of text line many"]
                        ["words line count line text"]]))
12/07/17 11:02:04 INFO *util.HadoopUtil: using default application jar, may cause class not found exceptions on the cluster*
12/07/17 11:02:04 INFO planner.HadoopPlanner: using application jar: /nfs/home/aaelony/.m2/repository/cascading/cascading-hadoop/2.0.0/cascading-hadoop-2.0.0.jar
12/07/17 11:02:04 INFO property.AppProps: using app.id: 569954D88846226B2D84D7F1AD64FEC3
12/07/17 11:02:04 INFO hadoop.TupleSerialization: using default comparator: cascalog.hadoop.DefaultComparator
12/07/17 11:02:04 INFO util.Version: Concurrent, Inc - Cascading 2.0.0
ClassNotFoundException org.codehaus.jackson.map.JsonMappingException java.net.URLClassLoader$1.run (URLClassLoader.java:200)
12/07/17 11:02:04 INFO flow.Flow: [] starting
12/07/17 11:02:04 INFO flow.Flow: [] source: MemorySourceTap["MemorySourceScheme[[UNKNOWN]->[ALL]]"]["/ecde3a83-eb07-4d96-b779-7a88ef84d223"]"]
12/07/17 11:02:04 INFO flow.Flow: [] sink: Hfs["SequenceFile[[UNKNOWN]->['?word', '?count']]"]["/tmp/cascalog_reserved/f0c4e82d-4e91-4108-bb45-330826edc40e/9d748dbb-3e3b-4c8f-a3a8-bd0e1a3ff855"]"]
12/07/17 11:02:04 INFO flow.Flow: [] parallel execution is enabled: false
12/07/17 11:02:04 INFO flow.Flow: [] starting jobs: 1
12/07/17 11:02:04 INFO flow.Flow: [] allocating threads: 1
12/07/17 11:02:04 INFO flow.FlowStep: [] starting step: (1/1) ...3b-4c8f-a3a8-bd0e1a3ff855
12/07/17 11:02:04 INFO flow.Flow: [] stopping all jobs
12/07/17 11:02:04 INFO flow.FlowStep: [] stopping: (1/1) ...3b-4c8f-a3a8-bd0e1a3ff855
12/07/17 11:02:04 INFO flow.Flow: [] stopped all jobs
12/07/17 11:02:04 INFO flow.Flow: [] shutting down job executor
12/07/17 11:02:04 INFO flow.Flow: [] shutdown complete
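For context, the query called in the example above is a plain word count.
What follows is only a minimal sketch of a query of that shape, not the
actual cascalog-hello source: the namespace name and the split op are
placeholders I made up, and the output fields are taken from the
?word / ?count pair shown in the sink line of the log.

```clojure
;; Hypothetical sketch, not the cascalog-hello source.
(ns hello.sketch                       ; placeholder namespace
  (:use cascalog.api)
  (:require [cascalog.ops :as c]
            [clojure.string :as s]))

;; Placeholder op: emits one tuple per whitespace-separated word.
(defmapcatop split [line]
  (s/split line #"\s+"))

;; lines is any generator of 1-tuples, e.g. an in-memory vector
;; such as [["a line of text"] ["more text"]].
(defn freq-count-query [lines]
  (<- [?word ?count]
      (lines ?line)
      (split ?line ?word)
      (c/count ?count)))
```

Calling (??- (freq-count-query ...)) on an in-memory vector like this runs
the flow through the local Hadoop planner, which is where the log above
comes from.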
Many thanks,
A