Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fact to Fact JOINS where Querying more than 1 Fact table #9209

Open
ElisabethPA opened this issue Feb 11, 2025 · 0 comments
Open

Fact to Fact JOINS where Querying more than 1 Fact table #9209

ElisabethPA opened this issue Feb 11, 2025 · 0 comments
Labels
question The issue is a question. Please use Stack Overflow for questions.

Comments

@ElisabethPA
Copy link

TWO Fact tables are joined in the same query indirectly, while there is no Fact-to-Fact JOIN defined in example .js file .

We are migrating from Microsoft multidimensional OLAP to CubeDev. Our backend data is in star schema format. We have many facts and dimensions and we are not able to merge fact tables together. They have different granularities. In OLAP cubes we are able to query two measures from different facts in the same MDX query without any problem, while in CubeDev we have a major stopper.

The structure of SQL query generated by CubeDev is slowing down the performance, because we have 1 Fact table join with common dimension + 2nd Fact table joined with common dimension. Both of these queries are combined using JOIN as well. In practice this is generating a cross join between tables, that potentially for us have millions of rows.

Is there any work-around to avoid having LEFT join used in a simple queries where more than one FACT table is involved?
Here is our example:

Tables used:

  1. FactSales
  2. DimProduct
  3. FactTargets

Query we tried from our side:

{
"limit": 5000,
"measures": [
"FactSales.totalSales",
"FactTargets.totalTargets"
],
"dimensions": [
"DimProduct.productName"
]
}

SQL Query that is generated involves LEFT JOIN on Fact tables, which slows down the performance when tables will have more than millions rows.

The problem for us is the inner query, which shows that two fact tables in example above are cross-joined. Is there any way to avoid it?

**Example of .js file here:**

cube(`FactSales`, {
  sql: `
    SELECT * FROM (VALUES
      (1, 1001, 500, '2024-02-01'),
      (2, 1002, 700, '2024-02-02'),
      (3, 1001, 800, '2024-02-03'),
      (4, 1003, 600, '2024-02-04')
    ) AS t(sale_id, product_id, sales_value, sale_date)
  `,

  joins: {
    DimProduct: {
      sql: `${CUBE}.product_id = ${DimProduct}.product_id`,
      relationship: `belongsTo`
    }
  },

  measures: {
    totalSales: {
      type: `sum`,
      sql: `sales_value`
    }
  },

  dimensions: {
    saleId: {
      sql: `sale_id`,
      type: `number`,
      primaryKey: true
    },
    productId: {
      sql: `product_id`,
      type: `string`
    },
    saleDate: {
      sql: `sale_date`,
      type: `time`
    }
  }
});

cube(`FactTargets`, {
  sql: `
    SELECT * FROM (VALUES
      (1, 1001, 1500, '2024-02'),
      (2, 1002, 1200, '2024-02'),
      (3, 1003, 1000, '2024-02')
    ) AS t(target_id, product_id, target_value, target_month)
  `,

  joins: {
    DimProduct: {
      sql: `${CUBE}.product_id = ${DimProduct}.product_id`,
      relationship: `belongsTo`
    }
  },

  measures: {
    totalTargets: {
      type: `sum`,
      sql: `target_value`
    }
  },

  dimensions: {
    targetId: {
      sql: `target_id`,
      type: `number`,
      primaryKey: true
    },
    productId: {
      sql: `product_id`,
      type: `string`
    },
    targetMonth: {
      sql: `target_month`,
      type: `string`
    }
  }
});

cube(`DimProduct`, {
  sql: `
    SELECT * FROM (VALUES
      (1001, 'Laptop', 'Electronics'),
      (1002, 'Phone', 'Electronics'),
      (1003, 'Desk', 'Furniture')
    ) AS t(product_id, product_name, category)
  `,

  joins: {
    FactSales: {
      sql: `${CUBE}.product_id = ${FactSales}.product_id`,
      relationship: `hasMany`
    }
  },

  dimensions: {
    productId: {
      sql: `product_id`,
      type: `string`,
      primaryKey: true
    },
    productName: {
      sql: `product_name`,
      type: `string`
    },
    category: {
      sql: `category`,
      type: `string`
    }
  }
});
@ElisabethPA ElisabethPA added the question The issue is a question. Please use Stack Overflow for questions. label Feb 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question The issue is a question. Please use Stack Overflow for questions.
Projects
None yet
Development

No branches or pull requests

1 participant