Why choose TimescaleDB over NoSQL?

Compared to general NoSQL databases (e.g., MongoDB, Cassandra) or even more specialized time-oriented ones (e.g., InfluxDB, KairosDB), TimescaleDB provides both qualitative and quantitative differences:

  • Standard SQL: TimescaleDB gives you the power of standard SQL queries on time-series data, even at scale. Most (if not all) NoSQL databases require either learning a new query language or using something that is at best "SQL-ish" (which still breaks compatibility with existing tools).
  • Operational simplicity: With TimescaleDB, you only need to manage one database for both your relational and time-series data. Otherwise, users often need to silo data into two databases: a "normal" relational one and a second time-series one.
  • JOINs can be performed across relational and time-series data.
  • Query performance is faster across a varied set of queries. On NoSQL databases, more complex queries are often slow or fall back to full table scans, and some databases cannot support many natural queries at all.
  • Manage like PostgreSQL and inherit its support for varied datatypes and indexes (B-tree, hash, range, BRIN, GiST, GIN).
  • Native support for geospatial data: Data stored in TimescaleDB can leverage PostGIS's geometric datatypes, indexes, and queries.
  • Third-party tools: TimescaleDB supports anything that speaks SQL, including BI tools like Tableau.
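
The point about JOINs across relational and time-series data can be illustrated with a small sketch. This is not TimescaleDB itself: SQLite (from Python's standard library) is used purely as a stand-in so the example runs without a server, and the table and column names (devices, conditions, location, temperature) are hypothetical. The query is ordinary SQL of the kind the bullets above describe.

```python
import sqlite3

# SQLite is only a stand-in here so the sketch runs without a database server.
# All table and column names below are hypothetical, chosen for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Relational metadata: one row per device.
    CREATE TABLE devices (
        device_id TEXT PRIMARY KEY,
        location  TEXT NOT NULL
    );
    -- Time-series data: many timestamped readings per device.
    CREATE TABLE conditions (
        time        TEXT NOT NULL,
        device_id   TEXT NOT NULL REFERENCES devices(device_id),
        temperature REAL
    );
""")

conn.executemany(
    "INSERT INTO devices VALUES (?, ?)",
    [("d1", "office"), ("d2", "garage")],
)
conn.executemany(
    "INSERT INTO conditions VALUES (?, ?, ?)",
    [
        ("2024-01-01T00:00:00Z", "d1", 20.0),
        ("2024-01-01T00:05:00Z", "d1", 22.0),
        ("2024-01-01T00:00:00Z", "d2", 10.0),
    ],
)

# Join the time-series readings with the relational metadata and
# aggregate per location, all in a single SQL statement.
rows = conn.execute("""
    SELECT d.location, COUNT(*) AS readings, AVG(c.temperature) AS avg_temp
    FROM conditions AS c
    JOIN devices AS d ON d.device_id = c.device_id
    WHERE c.time >= '2024-01-01T00:00:00Z'
    GROUP BY d.location
    ORDER BY d.location
""").fetchall()

for location, readings, avg_temp in rows:
    print(location, readings, avg_temp)
```

In a TimescaleDB deployment, the readings table would presumably be a hypertable with a proper timestamp column rather than text, but the JOIN itself would stay plain SQL, which is what lets existing SQL tools work against it.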

When not to use TimescaleDB?

Then again, if any of the following is true, you might not want to use TimescaleDB:

  • Simple read requirements: If you simply want fast key-value lookups or single-column rollups, an in-memory or column-oriented database might be more appropriate. The former clearly does not scale to the same data volumes, however, while the latter underperforms significantly on more complex queries.
  • Very sparse or unstructured data: While TimescaleDB leverages PostgreSQL's support for JSON/JSONB formats and handles sparsity quite efficiently (using bitmaps for NULL values), schema-less architectures may be more appropriate in certain scenarios.
  • Heavy compression is a priority: Benchmarks show TimescaleDB running on ZFS getting around 4x compression, but compression-optimized column stores might be more appropriate if you need higher compression ratios.
  • Infrequent or offline analysis: If slow response times are acceptable (or fast response times are needed only for a small number of pre-computed metrics), and if you don't expect many applications or users to access that data concurrently, you might avoid using a database at all and instead just store the data in a distributed file system.

