Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#3016 New expand PPL command #3305

Draft
wants to merge 81 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
81 commits
Select commit Hold shift + click to select a range
f462e2a
Add flatten command to ANTLR lexer and parser.
currantw Jan 17, 2025
69f0b1a
Skeleton implementation, tests, and documents with lots of TODOs.
currantw Jan 20, 2025
c1ac737
Initial implementation
currantw Jan 20, 2025
366e162
Fix typo
currantw Jan 24, 2025
0cbd8d4
Initial implementation
currantw Jan 27, 2025
26e9443
Update/fix tests.
currantw Jan 27, 2025
237b69e
Update integration tests to align with doc tests.
currantw Jan 31, 2025
3981c38
Minor cleanup.
currantw Jan 28, 2025
2ca7194
Add `ExplainIT` tests for flatten
currantw Jan 28, 2025
9ddfc4a
Revert recursive flattening, add documentation, more test updates
currantw Jan 28, 2025
c54c1f5
One more doctest fix
currantw Jan 28, 2025
8993e11
Fix `ExplainIT` error
currantw Jan 28, 2025
288add2
Add additional test case to `flatten.rst`
currantw Jan 28, 2025
eca3154
Fix `FlattenCommandIT`, add additional test case.
currantw Jan 28, 2025
c89a302
Fix `PhysicalPlanNodeVisitor` test coverage.
currantw Jan 28, 2025
9b2e9ce
Review: use `StringUtils.format` instead of `String.format`.
currantw Jan 29, 2025
82c8ccb
Fix `LogicalFlattenTest`.
currantw Jan 29, 2025
b7d8794
Simplify algorithm for `Analyzer`.
currantw Jan 29, 2025
ca013ef
Update to support flattening nested structs.
currantw Jan 30, 2025
7920bd8
Fix unrelated bug in `IPFUnctionsTest`.
currantw Jan 30, 2025
9d6459f
Update `IPFUnctionsTest` to anchor at start.
currantw Jan 30, 2025
6d040eb
Minor cleanup.
currantw Jan 30, 2025
43c0902
Fix doctest formatting.
currantw Jan 30, 2025
40362bf
Address minor review comments.
currantw Jan 30, 2025
b0a6710
Fix doc tests.
currantw Jan 31, 2025
be26660
Update integratation tests to align with doc tests.
currantw Jan 31, 2025
b3e4401
Review - minor documentation updates.
currantw Jan 31, 2025
4099f10
Remove double periods
currantw Feb 1, 2025
b96cefa
Add comment on `Map.equals`.
currantw Feb 1, 2025
72d98ed
Remove unnecessary error checks.
currantw Feb 1, 2025
4632c03
Update to maintain existing field.
currantw Feb 3, 2025
d755208
Update for test coverage
currantw Feb 3, 2025
09563ab
Simplify `Analyzer` implementation
currantw Feb 3, 2025
1d391ce
Rename `cities` dataset to `flatten`
currantw Feb 5, 2025
ef750f4
SpotlessApply
currantw Feb 5, 2025
14e005e
Minor doc cleanup.
currantw Feb 5, 2025
73885a7
Fix failing IT
currantw Feb 5, 2025
4fbd320
Update incorrect documentation in `Analyzer.visitFlatten`.
currantw Feb 5, 2025
337fb01
Update integ and doc tests to add another example of original field b…
currantw Feb 6, 2025
abe5c6c
Review comment - move example to `Analyzer.visitFlatten` Javadoc.
currantw Feb 6, 2025
a0022f4
Review comment - update `Analyzer.visitFlatten` Javadoc to specify th…
currantw Feb 6, 2025
df99d37
Review comment - remove unnecessary @Getter
currantw Feb 6, 2025
6883214
Review comments - add `testStructNestedDeep` test case
currantw Feb 6, 2025
94a4c8a
Review comments - add `testStructNestedDeep` test case
currantw Feb 6, 2025
26563c9
Woops! Fix failing test.
currantw Feb 6, 2025
bfb51a5
Review comments - extract `PathUtils` constants
currantw Feb 6, 2025
22eccaf
Review comments - update `Analyzer` to not use `Optional`.
currantw Feb 7, 2025
dcd241a
Bunch of additional review comments.
currantw Feb 7, 2025
befe55b
Spotless
currantw Feb 7, 2025
eb93cb1
Spotless
currantw Feb 7, 2025
1f05e85
Additional review comments, including move constants to `ExprValueUti…
currantw Feb 7, 2025
db96c51
Review comments - update tests for exception msg
currantw Feb 7, 2025
c1666ee
Review comments - simplify `FlattenOperator.flattenExprValueAtPath`.
currantw Feb 7, 2025
6e176a3
Change braces in documentation.
currantw Feb 10, 2025
ab5a2fe
Initial implementation of skeleton classes and methods.
currantw Feb 3, 2025
e7e5a5a
Implement some of the `expand` logic
currantw Feb 4, 2025
a8f6855
Add `PathUtils` and unit tests.
currantw Feb 4, 2025
cf357ea
Update `ExpandOperator` and unit tests.
currantw Feb 5, 2025
c260f6b
Implement integration tests, update `Expand` logic, rename data set.
currantw Feb 5, 2025
0d3c33c
Implement integration tests, update `Expand` logic, rename data set.
currantw Feb 5, 2025
209326d
Add `expand.rst` documentation and further updates to tests/implement…
currantw Feb 5, 2025
dd9a024
Unrelated typo fix
currantw Feb 5, 2025
e31df37
Cleanup, modify to return `null` for an empty array.
currantw Feb 6, 2025
274719d
Fix `test_docs.py` typo, order alphabetically.
currantw Feb 10, 2025
459ca24
Minor cleanup, mostly alphabetizing constants.
currantw Feb 10, 2025
eef907f
Add new doc and integration tests
currantw Feb 11, 2025
77afac7
Fix missing test coverage
currantw Feb 11, 2025
49078e9
Move `PathUtils` to `ExprValueUtils` and update tests.
currantw Feb 11, 2025
ff0d5ae
Use `ExprValueUtils` to simplify `FlattenOperator`
currantw Feb 11, 2025
790104c
Simplify and make consistent flatten and expand operators.
currantw Feb 12, 2025
ec47f8f
Update `ExprValueUtils` and unit tests.
currantw Feb 12, 2025
101b653
Make constants private
currantw Feb 12, 2025
ec0f44f
Spotless
currantw Feb 12, 2025
da7738a
Add hashCode unit tests
currantw Feb 12, 2025
4284f6f
Trivial documentation cleanup.
currantw Feb 12, 2025
1e51881
Fix doctest
currantw Feb 12, 2025
c4f732e
General cleanup, combine flatten and expand datasets.
currantw Feb 12, 2025
582fdaa
Spotless
currantw Feb 12, 2025
e8494f2
Fix failing doctests
currantw Feb 12, 2025
028e074
Fix `ExplainIT` test
currantw Feb 13, 2025
01569ae
Handle empty collections.
currantw Feb 13, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
207 changes: 176 additions & 31 deletions core/src/main/java/org/opensearch/sql/analysis/Analyzer.java

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
package org.opensearch.sql.analysis;

import java.util.List;
import org.opensearch.sql.data.model.ExprValueUtils;
import org.opensearch.sql.datasource.DataSourceService;

public class DataSourceSchemaIdentifierNameResolver {
Expand All @@ -21,8 +22,6 @@ public class DataSourceSchemaIdentifierNameResolver {
private final String identifierName;
private final DataSourceService dataSourceService;

private static final String DOT = ".";

/**
* Data model for capturing dataSourceName, schema and identifier from fully qualifiedName. In the
* current state, it is used to capture DataSourceSchemaTable name and DataSourceSchemaFunction in
Expand All @@ -35,7 +34,7 @@ public DataSourceSchemaIdentifierNameResolver(
DataSourceService dataSourceService, List<String> parts) {
this.dataSourceService = dataSourceService;
List<String> remainingParts = captureSchemaName(captureDataSourceName(parts));
identifierName = String.join(DOT, remainingParts);
identifierName = ExprValueUtils.joinQualifiedName(remainingParts);
}

public String getIdentifierName() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -44,9 +44,11 @@
import org.opensearch.sql.ast.tree.CloseCursor;
import org.opensearch.sql.ast.tree.Dedupe;
import org.opensearch.sql.ast.tree.Eval;
import org.opensearch.sql.ast.tree.Expand;
import org.opensearch.sql.ast.tree.FetchCursor;
import org.opensearch.sql.ast.tree.FillNull;
import org.opensearch.sql.ast.tree.Filter;
import org.opensearch.sql.ast.tree.Flatten;
import org.opensearch.sql.ast.tree.Head;
import org.opensearch.sql.ast.tree.Kmeans;
import org.opensearch.sql.ast.tree.Limit;
Expand Down Expand Up @@ -103,10 +105,18 @@ public T visitRelationSubquery(RelationSubquery node, C context) {
return visitChildren(node, context);
}

public T visitExpand(Expand node, C context) {
return visitChildren(node, context);
}

public T visitTableFunction(TableFunction node, C context) {
return visitChildren(node, context);
}

public T visitFlatten(Flatten node, C context) {
return visitChildren(node, context);
}

public T visitFilter(Filter node, C context) {
return visitChildren(node, context);
}
Expand Down
10 changes: 10 additions & 0 deletions core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
Original file line number Diff line number Diff line change
Expand Up @@ -49,8 +49,10 @@
import org.opensearch.sql.ast.tree.Aggregation;
import org.opensearch.sql.ast.tree.Dedupe;
import org.opensearch.sql.ast.tree.Eval;
import org.opensearch.sql.ast.tree.Expand;
import org.opensearch.sql.ast.tree.FillNull;
import org.opensearch.sql.ast.tree.Filter;
import org.opensearch.sql.ast.tree.Flatten;
import org.opensearch.sql.ast.tree.Head;
import org.opensearch.sql.ast.tree.Limit;
import org.opensearch.sql.ast.tree.Parse;
Expand Down Expand Up @@ -104,6 +106,14 @@ public static Eval eval(UnresolvedPlan input, Let... projectList) {
return new Eval(Arrays.asList(projectList)).attach(input);
}

public Expand expand(UnresolvedPlan input, Field field) {
return new Expand(field).attach(input);
}

public Flatten flatten(UnresolvedPlan input, Field field) {
return new Flatten(field).attach(input);
}

public static UnresolvedPlan projectWithArg(
UnresolvedPlan input, List<Argument> argList, UnresolvedExpression... projectList) {
return new Project(Arrays.asList(projectList), argList).attach(input);
Expand Down
2 changes: 0 additions & 2 deletions core/src/main/java/org/opensearch/sql/ast/tree/Eval.java
Original file line number Diff line number Diff line change
Expand Up @@ -10,14 +10,12 @@
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import lombok.Setter;
import lombok.ToString;
import org.opensearch.sql.ast.AbstractNodeVisitor;
import org.opensearch.sql.ast.expression.Let;

/** AST node represent Eval operation. */
@Getter
@Setter
@ToString
@EqualsAndHashCode(callSuper = false)
@RequiredArgsConstructor
Expand Down
40 changes: 40 additions & 0 deletions core/src/main/java/org/opensearch/sql/ast/tree/Expand.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.ast.tree;

import com.google.common.collect.ImmutableList;
import java.util.List;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import lombok.ToString;
import org.opensearch.sql.ast.AbstractNodeVisitor;
import org.opensearch.sql.ast.expression.Field;

/** AST node representing an {@code expand <field>} operation. */
@Getter
@ToString
@RequiredArgsConstructor
public class Expand extends UnresolvedPlan {

private UnresolvedPlan child;
@Getter private final Field field;

@Override
public Expand attach(UnresolvedPlan child) {
this.child = child;
return this;
}

@Override
public List<UnresolvedPlan> getChild() {
return ImmutableList.of(child);
}

@Override
public <T, C> T accept(AbstractNodeVisitor<T, C> nodeVisitor, C context) {
return nodeVisitor.visitExpand(this, context);
}
}
39 changes: 39 additions & 0 deletions core/src/main/java/org/opensearch/sql/ast/tree/Flatten.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.ast.tree;

import com.google.common.collect.ImmutableList;
import java.util.List;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import lombok.ToString;
import org.opensearch.sql.ast.AbstractNodeVisitor;
import org.opensearch.sql.ast.expression.Field;

/** AST node representing a {@code flatten <field>} operation. */
@ToString
@RequiredArgsConstructor
public class Flatten extends UnresolvedPlan {

private UnresolvedPlan child;
@Getter private final Field field;

@Override
public Flatten attach(UnresolvedPlan child) {
this.child = child;
return this;
}

@Override
public List<UnresolvedPlan> getChild() {
return ImmutableList.of(child);
}

@Override
public <T, C> T accept(AbstractNodeVisitor<T, C> nodeVisitor, C context) {
return nodeVisitor.visitFlatten(this, context);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ public String getAlias() {
}

/**
* Get Qualified name preservs parts of the user given identifiers. This can later be utilized to
* Get Qualified name preserves parts of the user given identifiers. This can later be utilized to
* determine DataSource,Schema and Table Name during Analyzer stage. So Passing QualifiedName
* directly to Analyzer Stage.
*
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@

import java.time.temporal.TemporalAmount;
import java.time.temporal.TemporalUnit;
import java.util.Objects;
import lombok.RequiredArgsConstructor;
import org.opensearch.sql.data.type.ExprCoreType;
import org.opensearch.sql.data.type.ExprType;
Expand Down Expand Up @@ -56,4 +57,9 @@ public TemporalUnit unit() {
.findAny()
.orElse(interval.getUnits().get(0));
}

@Override
public int hashCode() {
return Objects.hashCode(interval);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
package org.opensearch.sql.data.model;

import inet.ipaddr.IPAddress;
import java.util.Objects;
import org.opensearch.sql.data.type.ExprCoreType;
import org.opensearch.sql.data.type.ExprType;
import org.opensearch.sql.utils.IPUtils;
Expand Down Expand Up @@ -47,4 +48,9 @@ public String toString() {
public IPAddress ipValue() {
return value;
}

@Override
public int hashCode() {
return Objects.hashCode(value);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@

package org.opensearch.sql.data.model;

import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Map.Entry;
Expand Down Expand Up @@ -69,23 +68,20 @@ public ExprValue keyValue(String key) {
*
* @return true for equal, otherwise false.
*/
public boolean equal(ExprValue o) {
if (!(o instanceof ExprTupleValue)) {
@Override
public boolean equal(ExprValue other) {

if (!(other instanceof ExprTupleValue)) {
return false;
} else {
ExprTupleValue other = (ExprTupleValue) o;
Iterator<Entry<String, ExprValue>> thisIterator = this.valueMap.entrySet().iterator();
Iterator<Entry<String, ExprValue>> otherIterator = other.valueMap.entrySet().iterator();
while (thisIterator.hasNext() && otherIterator.hasNext()) {
Entry<String, ExprValue> thisEntry = thisIterator.next();
Entry<String, ExprValue> otherEntry = otherIterator.next();
if (!(thisEntry.getKey().equals(otherEntry.getKey())
&& thisEntry.getValue().equals(otherEntry.getValue()))) {
return false;
}
}
return !(thisIterator.hasNext() || otherIterator.hasNext());
}

/**
* {@link Map#equals} returns true if the two maps' entry sets are equal, and works properly
* across all implementation of the {@link Map} interface. See {@link
* https://docs.oracle.com/javase/8/docs/api/java/util/Map.html#equals-java.lang.Object-} for
* more details.
*/
return valueMap.equals(other.tupleValue());
}
Comment on lines +72 to 85
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change pulled from flatten PR. See discussion there.
#3267


/** Only compare the size of the map. */
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,21 +13,32 @@
import java.time.ZoneOffset;
import java.time.temporal.TemporalAmount;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import lombok.experimental.UtilityClass;
import org.opensearch.sql.data.type.ExprCoreType;
import org.opensearch.sql.exception.ExpressionEvaluationException;

/** The definition of {@link ExprValue} factory. */
@UtilityClass
public class ExprValueUtils {

// Literal constants
public static final ExprValue LITERAL_TRUE = ExprBooleanValue.of(true);
public static final ExprValue LITERAL_FALSE = ExprBooleanValue.of(false);
public static final ExprValue LITERAL_NULL = ExprNullValue.of();
public static final ExprValue LITERAL_MISSING = ExprMissingValue.of();

/** Qualified name separator string */
private final String QUALIFIED_NAME_SEPARATOR = ".";

/** Pattern that matches the qualified name separator string */
private final Pattern QUALIFIED_NAME_SEPARATOR_PATTERN =
Pattern.compile(QUALIFIED_NAME_SEPARATOR, Pattern.LITERAL);

public static ExprValue booleanValue(Boolean value) {
return value ? LITERAL_TRUE : LITERAL_FALSE;
}
Expand Down Expand Up @@ -200,4 +211,14 @@ public static IPAddress getIpValue(ExprValue exprValue) {
public static Boolean getBooleanValue(ExprValue exprValue) {
return exprValue.booleanValue();
}

/** Splits the given qualified name into components and returns the result. */
public List<String> splitQualifiedName(String qualifiedName) {
return Arrays.asList(QUALIFIED_NAME_SEPARATOR_PATTERN.split(qualifiedName, -1));
}

/** Joins the given components into a qualified name and returns the result. */
public String joinQualifiedName(List<String> components) {
return String.join(QUALIFIED_NAME_SEPARATOR, components);
}
}
19 changes: 19 additions & 0 deletions core/src/main/java/org/opensearch/sql/executor/Explain.java
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,9 @@
import org.opensearch.sql.planner.physical.AggregationOperator;
import org.opensearch.sql.planner.physical.DedupeOperator;
import org.opensearch.sql.planner.physical.EvalOperator;
import org.opensearch.sql.planner.physical.ExpandOperator;
import org.opensearch.sql.planner.physical.FilterOperator;
import org.opensearch.sql.planner.physical.FlattenOperator;
import org.opensearch.sql.planner.physical.LimitOperator;
import org.opensearch.sql.planner.physical.NestedOperator;
import org.opensearch.sql.planner.physical.PhysicalPlan;
Expand Down Expand Up @@ -160,6 +162,23 @@ public ExplainResponseNode visitEval(EvalOperator node, Object context) {
ImmutableMap.of("expressions", convertPairListToMap(node.getExpressionList()))));
}

@Override
public ExplainResponseNode visitExpand(ExpandOperator node, Object context) {
return explain(
node,
context,
explainNode -> explainNode.setDescription(ImmutableMap.of("expandField", node.getField())));
}

@Override
public ExplainResponseNode visitFlatten(FlattenOperator node, Object context) {
return explain(
node,
context,
explainNode ->
explainNode.setDescription(ImmutableMap.of("flattenField", node.getField())));
}

@Override
public ExplainResponseNode visitDedupe(DedupeOperator node, Object context) {
return explain(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,15 +5,14 @@

package org.opensearch.sql.expression;

import static org.opensearch.sql.utils.ExpressionUtils.PATH_SEP;

import java.util.Arrays;
import java.util.List;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import org.opensearch.sql.data.model.ExprTupleValue;
import org.opensearch.sql.data.model.ExprValue;
import org.opensearch.sql.data.model.ExprValueUtils;
import org.opensearch.sql.data.type.ExprCoreType;
import org.opensearch.sql.data.type.ExprType;
import org.opensearch.sql.expression.env.Environment;
Expand Down Expand Up @@ -106,7 +105,7 @@ public ExprValue resolve(ExprTupleValue value) {
}

private ExprValue resolve(ExprValue value, List<String> paths) {
ExprValue wholePathValue = value.keyValue(String.join(PATH_SEP, paths));
ExprValue wholePathValue = value.keyValue(ExprValueUtils.joinQualifiedName(paths));
// For array types only first index currently supported.
if (value.type().equals(ExprCoreType.ARRAY)) {
wholePathValue = value.collectionValue().get(0).keyValue(paths.get(0));
Expand Down
Loading
Loading