apache/doris

[Good First Issue] Support All SQL Functions in Other SQL System

Open

#48203 opened on Feb 22, 2025

doris-future, good first issue

UPDATE:

Hey guys, given the current capabilities of LLMs, especially for this specific set of cases, the agents available on GitHub can already generate PRs that meet our quality and requirement standards within minutes. Therefore, we will no longer invest much effort in this issue. If you're interested in implementing these cases, I recommend discussing them in depth with an AI, making sure you fully understand the code you are about to submit, and then submitting the PR directly. PRs involving new functions will automatically request my review, and I will check them in a timely manner. I will no longer reply to specific comments under this issue.

Thanks everyone.


Description

We plan to implement all the SQL functions found in other well-known databases, such as MySQL, PG, Trino, CK, Hive, and more, to make it easier for users to migrate to Doris. These tasks are well suited to newcomers as a first Doris PR. So here's the list. Feel free to comment to pick any one! Once one is picked, I will tick it.

Part I. Hive

  • sinh (from Trino), asinh, atanh, acosh (Easy) @ChenMiaoi
  • context_ngrams @noixcn
  • factorial (Easy) @K-handle-Y
  • levenshtein @whisper33z
  • encode, decode @hacklu-tu
  • soundex

See the latest Hive documentation for explanations of these functions.
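
To give a feel for the kind of semantics a new function has to match, here is a minimal reference sketch of `levenshtein` in Python. It is only an illustration of the classic edit-distance algorithm, not Doris or Hive code, and it assumes character-level comparison; the real implementation has to follow the Hive documentation for NULL handling and encoding.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    # Keep the shorter string in the inner loop so the row buffer stays small.
    if len(a) < len(b):
        a, b = b, a
    # prev[j] is the distance between the first i-1 chars of a and first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A handful of small cases like this one can double as regression-test inputs when writing the PR.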

Part II. Spark

  • map_concat (taken for an interview, @HappenLee)
  • regexp_extract_all (support for the third argument)
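
For the `regexp_extract_all` item, the third argument in Spark is the capture-group index to return for each match (defaulting to 1, with 0 meaning the whole match). The sketch below mimics that behavior with Python's `re` module. It is an assumption-laden illustration: Python's regex dialect differs from Java's, and returning an empty string for a group that did not participate in a match is a guess, so check the Spark documentation before relying on edge cases.

```python
import re
from typing import List


def regexp_extract_all(text: str, pattern: str, idx: int = 1) -> List[str]:
    """Return capture group `idx` of every match of `pattern` in `text`."""
    regex = re.compile(pattern)
    # idx must name an existing group; 0 selects the whole match.
    if idx < 0 or idx > regex.groups:
        raise ValueError(f"group index {idx} out of range 0..{regex.groups}")
    # Assumption: a group that did not participate yields "" rather than NULL.
    return [m.group(idx) or "" for m in regex.finditer(text)]


print(regexp_extract_all("100-200, 300-400", r"(\d+)-(\d+)", 1))  # ['100', '300']
print(regexp_extract_all("100-200, 300-400", r"(\d+)-(\d+)", 2))  # ['200', '400']
```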

Part III. Trino & Presto

  • regexp_count
  • regexp_position @lsy3993
  • hamming_distance (best implemented together with levenshtein)
  • human_readable_seconds
  • timezone_hour, timezone_minute @om2805
  • GEO FUNCTIONS
    • ST_GeomFromKML
    • ST_Equals, ST_Relate
    • ST_Intersects, ST_Disjoint, ST_Touches @koi2000
    • ST_Crosses, ST_Overlaps, ST_Relate, ST_Within
    • ST_Buffer, ST_Boundary, ST_Envelope, ST_EnvelopeAsPts, ST_ExteriorRing
    • geometry_nearest_points, geometry_union, ST_Union
    • ST_Difference, ST_Intersection, ST_SymDifference
    • ST_Centroid, ST_ConvexHull
    • ST_CoordDim, ST_Dimension
    • ST_Distance, ST_GeometryType, ST_Length @zxc20041
    • ST_InteriorRingN, ST_InteriorRings, ST_NumInteriorRing
    • ST_GeometryType, ST_IsClosed, ST_IsEmpty, ST_IsSimple, ST_IsRing, ST_IsValid
    • ST_PointN, ST_StartPoint, ST_EndPoint, ST_Points, ST_XMax, ST_XMin, ST_YMax, ST_YMin
    • simplify_geometry
    • ST_NumGeometries, ST_Geometries, ST_NumPoints
  • ARRAY FUNCTIONS
    • dot_product @meox3259
    • trim_array @vajaw
    • ngrams @advisedy
    • combinations @daju233
    • reduce (lambda function) @cypppper
    • sort (add the three arguments with lambda functor version) (Hard)
  • merge(HLL)
  • typeof
  • Aggregation Functions
    • bool_or, bool_and
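
As a small example from this part, `hamming_distance` counts the positions at which two equal-length strings differ. The Python sketch below shows the intended semantics only. Rejecting inputs of different lengths follows Trino's documented behavior as far as I know; the exact error type and NULL handling in Doris are assumptions to verify against the Trino documentation.

```python
def hamming_distance(s1: str, s2: str) -> int:
    """Number of positions where two equal-length strings differ."""
    if len(s1) != len(s2):
        # Assumption: unequal lengths are an error, as in Trino.
        raise ValueError("hamming_distance requires strings of equal length")
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


print(hamming_distance("karolin", "kathrin"))  # 3
```

Because it shares its input validation and test data with `levenshtein`, it makes sense to implement the two in the same PR.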

Part IV. DuckDB

  • Math Functions
    • even @wumeibanfa
    • gcd, lcm @wumeibanfa
    • gamma @Patinlove
    • signbit @wumeibanfa
  • String Functions
    • ord @CAICAIIs
  • Vector(Array) Functions
    • cross_product @juruo-c
    • cosine_similarity @Pluto340
  • Date Functions
    • century @robll-v1
  • Aggregation Functions
    • geomean @0AyanamiRei
    • entropy @wrlcke
    • sem @wumeibanfa
    • skew_pop, kurt_pop @mickaelli
  • Map Functions
    • map_contains_entry @DayuanX
    • map_entries @DayuanX
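
Several of the DuckDB items are short numeric functions. The sketch below gives Python reference versions of `gcd`, `lcm`, and `geomean` to show the expected results. It is a rough illustration under assumptions: `lcm` returning 0 when either input is 0, and `geomean` skipping NULLs (modeled as `None`) and being computed as `exp(avg(ln(x)))`, should all be confirmed against the DuckDB documentation.

```python
import math
from typing import Iterable, Optional


def gcd(a: int, b: int) -> int:
    """Greatest common divisor (always non-negative)."""
    return math.gcd(a, b)


def lcm(a: int, b: int) -> int:
    """Least common multiple; assumed to be 0 if either input is 0."""
    if a == 0 or b == 0:
        return 0
    return abs(a * b) // math.gcd(a, b)


def geomean(values: Iterable[Optional[float]]) -> Optional[float]:
    """Geometric mean as exp(mean(ln(x))), ignoring None (NULL) inputs."""
    vals = [v for v in values if v is not None]
    if not vals:
        return None  # aggregate over no rows yields NULL
    # math.log raises ValueError for non-positive inputs.
    return math.exp(sum(math.log(v) for v in vals) / len(vals))


print(gcd(42, 57), lcm(42, 57))  # 3 798
```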

Part V. MySQL (High Priority)

Others

  • json_search with 4th and 5th arguments like MySQL @ChenMiaoi

More tasks are coming...

Solution

All the guidelines for implementing an SQL function are in https://github.com/apache/doris/issues/48201. Please take a careful look at them!
