Hi,

I just learned about JCascalog thanks to the MEAP. At first I thought the
name was a spelling error, but I now love its goals.
https://github.com/nathanmarz/cascalog/wiki/JCascalog

Of course, it is "naturally more verbose" in Java, but I would wager
the API could be improved through:
* removal of 'new' and class names
* use of varargs
* use of static import
* use of method chaining
* use of type checking
* use of IDE completion

=> 1st proposal (first step) :

Api.execute(
    new StdoutTap(),
    new Subquery(new Fields("?person"),
        new Predicate(Playground.AGE, new Fields("?person", 25))
    ));

could become

query("?person", bind(Playground.AGE).to("?person", 25)).writeTo(stdout());

and

Api.execute(
    new StdoutTap(),
    new Subquery(new Fields("?person", "?age", "?double-age"),
        new Predicate(Playground.AGE, new Fields("?person", "?age")),
        new Predicate(new Multiply(), new Fields("?age", 2), new Fields("?double-age"))
    ));

could become

query(fields("?person", "?age", "?double-age"),
bind(Playground.AGE).to("?person", "?age"),
mult("?age", 2).to("?double-age")).writeTo(stdout());

Very generic, but fields() needs to be used when more than one output
field is needed, and there is no IDE completion for common predicates.
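
As a rough illustration of how this first proposal could hang together, here is a minimal sketch built on static factory methods and varargs. Every type below (Fields, Predicate, PendingPredicate, Query) is a hypothetical stand-in, not the real JCascalog class of the same name, and the "multiply" string is only a placeholder for an operation such as new Multiply().

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: these types are stand-ins, not the real JCascalog classes.
final class StaticFactorySketch {

    // A list of field names or constants, built with varargs.
    static final class Fields {
        final List<Object> values;
        Fields(Object... values) { this.values = Arrays.asList(values); }
        @Override public String toString() { return values.toString(); }
    }

    // An operation together with its input and output fields.
    static final class Predicate {
        final Object op;
        final Fields in;
        final Fields out;
        Predicate(Object op, Fields in, Fields out) {
            this.op = op;
            this.in = in;
            this.out = out;
        }
    }

    // Returned by bind() and mult(); calling to() completes the predicate.
    static final class PendingPredicate {
        private final Object op;
        private final Fields in;
        PendingPredicate(Object op, Fields in) {
            this.op = op;
            this.in = in;
        }
        Predicate to(Object... outFields) {
            return new Predicate(op, in, new Fields(outFields));
        }
    }

    static final class Query {
        final Fields outFields;
        final List<Predicate> predicates;
        Query(Fields outFields, List<Predicate> predicates) {
            this.outFields = outFields;
            this.predicates = predicates;
        }
    }

    static Fields fields(Object... values) { return new Fields(values); }

    // A generator has no input fields, only the fields it is bound to.
    static PendingPredicate bind(Object generator) {
        return new PendingPredicate(generator, new Fields());
    }

    // "multiply" is a placeholder for an operation such as new Multiply().
    static PendingPredicate mult(Object... inputs) {
        return new PendingPredicate("multiply", new Fields(inputs));
    }

    // Single output field: no fields() wrapper needed.
    static Query query(String outField, Predicate... predicates) {
        return new Query(fields(outField), Arrays.asList(predicates));
    }

    // Several output fields: wrap them with fields(...).
    static Query query(Fields outFields, Predicate... predicates) {
        return new Query(outFields, Arrays.asList(predicates));
    }
}
```

With a static import of these methods, the second example above reads as written. The two query(...) overloads are why fields() is only required once there is more than one output field: the predicates occupy the varargs slot, so several output names cannot also be passed as loose strings.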

=> 2nd proposal (the best?) :

Api.execute(
    new StdoutTap(),
    new Subquery(new Fields("?person"),
        new Predicate(Playground.AGE, new Fields("?person", 25))
    ));

could become

query("?person")
    .bind(Playground.AGE).to("?person", 25)
    .writeTo(stdout());

and

Api.execute(
    new StdoutTap(),
    new Subquery(new Fields("?person", "?age", "?double-age"),
        new Predicate(Playground.AGE, new Fields("?person", "?age")),
        new Predicate(new Multiply(), new Fields("?age", 2), new Fields("?double-age"))
    ));

could become

query("?person", "?age", "?double-age")
    .bind(Playground.AGE).to("?person", "?age")
    .mult("?age", 2).to("?double-age")
    .writeTo(stdout());

I like this one better, but it will be a bit harder to add new
predicates: interfaces should be used to enforce the use of to() for
non-boolean predicates. With a good API, that should not be a problem
though.
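
One way the "interfaces enforce to()" idea might look is a staged builder, where each method returns an interface exposing only the calls that are legal next. The sketch below is an assumption about how that could be done, not code from JCascalog: NeedsOutput, QueryStage, filter() and describe() are hypothetical names, and the builder only records a textual description of each step instead of building a real query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a staged fluent builder; not the real JCascalog API.
final class StagedBuilderSketch {

    // Stage reached after a non-boolean predicate: the only legal call is to().
    interface NeedsOutput {
        QueryStage to(Object... outFields);
    }

    // Stage where further predicates can be added or the query inspected.
    interface QueryStage {
        NeedsOutput bind(Object generator);
        NeedsOutput mult(Object... inputs);
        // A boolean predicate produces no new fields, so no to() is required.
        QueryStage filter(String op, Object... inputs);
        List<String> describe();
    }

    static QueryStage query(String... outFields) {
        return new Builder(outFields);
    }

    // One object plays both stages; the return types restrict what callers see.
    private static final class Builder implements QueryStage, NeedsOutput {
        private final List<String> steps = new ArrayList<>();
        private String pending = "";

        Builder(String[] outFields) {
            steps.add("query " + Arrays.toString(outFields));
        }

        @Override public NeedsOutput bind(Object generator) {
            pending = "bind " + generator;
            return this;
        }

        @Override public NeedsOutput mult(Object... inputs) {
            pending = "mult " + Arrays.toString(inputs);
            return this;
        }

        @Override public QueryStage to(Object... outFields) {
            steps.add(pending + " -> " + Arrays.toString(outFields));
            return this;
        }

        @Override public QueryStage filter(String op, Object... inputs) {
            steps.add("filter " + op + " " + Arrays.toString(inputs));
            return this;
        }

        @Override public List<String> describe() {
            return new ArrayList<>(steps);
        }
    }
}
```

Because bind() and mult() return NeedsOutput, a chain such as query("?person").bind(x).mult(...) would be rejected by the compiler: to() has to come first. The cost hinted at above is visible here too, since every new predicate shortcut means another method on QueryStage.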

=> so what?
Of course, this post is not only about the idea: I am volunteering to
do it. But I would really like some feedback on it first, as I might
have overlooked big issues. I might not be able to do it before
the end of this month, but next month should be doable.

Regards

Bertrand


  • Nathan Marz at May 9, 2012 at 6:52 pm
    This is a really good idea. I think I'd prefer a slightly different API
    though:

    query("?person", "?age", "?double-age")
    .pred(Playground.AGE, "?person", "?age")
    .pred(new GT(), "?age", 2);

    or for word count:

    query("?word", "?count")
    .pred(Playground.SENTENCE, "?sentence")
    .pred(new Split(), "?sentence").out("?word")
    .pred(new Count(), "?count");

    The idea being that the fields you provide in pred are:
    1) "default fields" (to be interpreted by Cascalog depending on pred type)
    if there's no .out clause
    2) input fields if there's an out clause
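
The two rules above can be sketched roughly as follows. This is only an assumption about one possible shape, with hypothetical Pred and Query classes and plain strings standing in for operations like new Split(). Here pred() stores its fields as "default fields", and a following out() marks them as inputs by attaching output fields to the most recent predicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of pred(...).out(...); not the real JCascalog API.
final class PredSketch {

    static final class Pred {
        final Object op;
        // "Default fields" when there is no out clause, input fields otherwise.
        final List<Object> fields;
        // Stays null when the predicate has no .out clause.
        List<Object> outFields;

        Pred(Object op, List<Object> fields) {
            this.op = op;
            this.fields = fields;
        }

        boolean hasOutClause() { return outFields != null; }
    }

    static final class Query {
        final List<String> outVars;
        final List<Pred> preds = new ArrayList<>();

        Query(List<String> outVars) { this.outVars = outVars; }

        // Every operation goes through the same method: no special cases.
        Query pred(Object op, Object... fields) {
            preds.add(new Pred(op, Arrays.asList(fields)));
            return this;
        }

        // Applies to the most recently added predicate.
        Query out(Object... outFields) {
            if (preds.isEmpty()) {
                throw new IllegalStateException("out() must follow pred()");
            }
            preds.get(preds.size() - 1).outFields = Arrays.asList(outFields);
            return this;
        }
    }

    static Query query(String... outVars) {
        return new Query(Arrays.asList(outVars));
    }
}
```

In this sketch, how the default fields of a predicate without an out clause get interpreted is left to whatever consumes the Query, which matches the "depending on pred type" wording above.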

    I don't think it's good for there to be things like "mult" in the API, as
    all operations should be treated equally.

    I also don't like having "writeTo" being part of the fluent query API.
    Execution should be able to take in many tap, query pairs. So something
    like this would be better:

    execute(stdout(), query1, othertap, query2);
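
A small sketch of how such a varargs call could be validated, assuming hypothetical Tap and Query placeholder classes and a hypothetical pairUp() helper (none of these are the real Cascading or JCascalog types): the arguments are read two at a time and rejected if they do not alternate tap, query.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of pairing taps with queries; not the real JCascalog API.
final class ExecuteSketch {

    static final class Tap {
        final String name;
        Tap(String name) { this.name = name; }
    }

    static final class Query {
        final String name;
        Query(String name) { this.name = name; }
    }

    // Reads the arguments as alternating (tap, query) pairs, keeping their order.
    static Map<Tap, Query> pairUp(Object... tapsAndQueries) {
        if (tapsAndQueries.length % 2 != 0) {
            throw new IllegalArgumentException("expected tap, query pairs");
        }
        Map<Tap, Query> plan = new LinkedHashMap<>();
        for (int i = 0; i < tapsAndQueries.length; i += 2) {
            Object tap = tapsAndQueries[i];
            Object query = tapsAndQueries[i + 1];
            if (!(tap instanceof Tap) || !(query instanceof Query)) {
                throw new IllegalArgumentException(
                    "argument " + i + " must be a tap followed by a query");
            }
            plan.put((Tap) tap, (Query) query);
        }
        return plan;
    }
}
```

The trade-off of an Object... signature is that a mismatched pair is only caught at run time; overloads taking one, two, or three typed pairs would move that check to compile time at the price of more methods.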

    Like Cascalog, we can make a wrapper for the case of defining and executing
    a query at the same time:

    execute_query(stdout(), "?word", "?count")
    .pred(Playground.SENTENCE, "?sentence")
    .pred(new Split(), "?sentence").out("?word")
    .pred(new Count(), "?count");


    Bertrand, if you want to take a crack at this, open up an issue on GitHub
    where we can discuss the design more. This would be an awesome contribution.


    --
    Twitter: @nathanmarz
    http://nathanmarz.com
  • Bertrand Dechoux at May 9, 2012 at 9:26 pm
    I tried my 2nd proposal here (I couldn't resist)
    https://github.com/BertrandDechoux/cascalog/blob/fluent-jcascalog/src/jvm/jcascalog/fluent/Demo.java

    I will open an issue later taking into account your answer so that we
    can iterate on it.

    Regards

    Bertrand

Discussion Overview
group: cascalog-user
categories: clojure, hadoop
posted: May 9, '12 at 5:51p
active: May 9, '12 at 9:26p
posts: 3
users: 2 (Bertrand Dechoux: 2 posts, Nathan Marz: 1 post)
website: clojure.org
irc: #clojure
