Skip to content

[SPARK-56505][SQL][TESTS] SharedSparkSession provides sql.SparkSession, add SharedClassicSparkSession, ClassicQueryTest#55356

Draft
fwc wants to merge 32 commits into
apache:masterfrom
fwc:sharedsparksession-return-abstract-sess
Draft

[SPARK-56505][SQL][TESTS] SharedSparkSession provides sql.SparkSession, add SharedClassicSparkSession, ClassicQueryTest#55356
fwc wants to merge 32 commits into
apache:masterfrom
fwc:sharedsparksession-return-abstract-sess

Conversation

@fwc
Copy link
Copy Markdown

@fwc fwc commented Apr 15, 2026

What changes were proposed in this pull request?

The PR adds SharedClassicSparkSession and ClassicSQLTestUtils helpers that provide/use classic.SparkSession and then changes SharedSparkSession to provide only an abstract sql.SparkSession.

The PR is split in the following sections:

  1. Add small refactors that reduce usage of classic-only APIs / prepare the main change
  2. Adding placeholders for ClassicSQLTestUtils and SharedClassicSparkSession
  3. Replace SQLTestUtils and SharedSparkSession with their Classic-subtraits wherever needed
  4. Move classic-only stuff to the Classic-subtraits, only provide abstract session in SharedSparkSession

Why are the changes needed?

For historical reasons, many tests use SharedSparkSession, even though they do not necessarily are strictly classic-only.
This change "lifts" SharedSparkSession to be more abstract, so that potentially some test traits/classes could be extend in such a way that a connect.SparkSession is supplied.

The added SharedClassicSparkSession and ClassicSQLTestUtils provide the needed helpers for classic-only suites.

Does this PR introduce any user-facing change?

No. This is test-only and mostly type-level.

How was this patch tested?

Existing tests.

Was this patch authored or co-authored using generative AI tooling?

The initial patch was 'hand-written'. The patch was then rebased onto the 'SQLTestUtils merged into QueryTest' changes via claude code.

@zhengruifeng zhengruifeng changed the title [WIP] SharedSparkSession provides sql.SparkSession, add SharedClassicSparkSession, ClassicSQLTestUtils [SPARK-56505][SQL][TESTS] SharedSparkSession provides sql.SparkSession, add SharedClassicSparkSession, ClassicSQLTestUtils Apr 16, 2026
@zhengruifeng
Copy link
Copy Markdown
Contributor

@fwc We are going to merge SQLTestUtils into QueryTest, and then make SQLTestUtils just a alias to QueryTest.

* Use this trait to indicate that this test is classic-only, i.e. it would not make sense to run
* this test with a [[org.apache.spark.sql.connect.SparkSession]].
*/
trait SharedClassicSparkSession extends SharedSparkSession with ClassicSQLTestUtils {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering should it just maxin with a trait like ClassicSparkSessionProvider

@zhengruifeng zhengruifeng requested a review from cloud-fan April 16, 2026 03:35
@fwc fwc force-pushed the sharedsparksession-return-abstract-sess branch 2 times, most recently from c86f3b7 to 6f9618e Compare April 18, 2026 17:31
@fwc fwc force-pushed the sharedsparksession-return-abstract-sess branch from fe547c8 to 166639e Compare May 19, 2026 19:26
Matthis Gördel and others added 23 commits May 20, 2026 17:20
The sql.DatasetHolder will be used by the SQLTestUtils.testImplicits
instead of the classic.DatasetHolder, which will be used in
ClassicSQLTestUtils.classicTestImplicts.
I think too many tests rely on this to move it to ClassicSQLTestUtils.
I guess we'd need some (test-only?!) extension to connect to make this
work.
I think too many tests rely on this to move it to ClassicSQLTestUtils.
I guess we'd need some (test-only?!) extension to connect to make this
work.
Does this block?! If not: test-only connect extension?
…super

super (i.e. SharedSparkSession) will only provide sql.SparkSession.
I think there'd be a way to not require the cast when
SharedClassicSparkSession extends initialization logic.
I'm kinda unsure what is more hacky.
fwc and others added 8 commits May 20, 2026 17:21
…flict resolution

Seven files had upstream additions (PATH tests, Changelog CDC tests,
NullIf tests, null-aware anti join test, etc.) accidentally dropped
when the rebase used `git checkout --theirs` to resolve conflicts.
Restore master content and re-apply only the SharedClassicSparkSession
/ classicTestImplicits changes.

Co-authored-by: Isaac
SQLTestUtils was merged into QueryTest upstream, so this trait should
extend and be named after QueryTest, not the deprecated alias.

Co-authored-by: Isaac
Mechanical rename across all call sites.

Co-authored-by: Isaac
@fwc fwc force-pushed the sharedsparksession-return-abstract-sess branch from 29c0a9a to d4ae708 Compare May 20, 2026 17:21
@fwc fwc changed the title [SPARK-56505][SQL][TESTS] SharedSparkSession provides sql.SparkSession, add SharedClassicSparkSession, ClassicSQLTestUtils [SPARK-56505][SQL][TESTS] SharedSparkSession provides sql.SparkSession, add SharedClassicSparkSession, ClassicQueryTest May 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants