Newsfeeds
Planet MySQL - https://planet.mysql.com

  • Fun with Bugs #94 - On MySQL Bug Reports I am Subscribed to, Part XXVIII
    I may get a chance to speak about proper bug processing for open source projects later this year, so I have to keep reviewing recent MySQL bugs to be ready for that. In my previous post in this series I listed some interesting MySQL bug reports created in December 2019. Time to move on to January 2020! Belated Happy New Year of cool MySQL bugs! As usual I mostly care about InnoDB, replication and optimizer bugs, and I explicitly mention each bug reporter by name and link to their other active reports (if any). I also pick examples of proper (or improper) attitudes from reporters and Oracle engineers. Here is the list:
Bug #98103 - "unexpected behavior while logging an aborted query in the slow query log". A query that is killed while waiting for a table metadata lock not only gets logged, but the lock wait time is also saved as the query execution time. I'd like to highlight how the bug reporter, Pranay Motupalli, used gdb to study what really happens in the code in this case. A perfect bug report!
Bug #98113 - "Crash possible when load & unload a connection handler". This (quite obvious) bug was verified based on code review, but only after an Oracle engineer spent some effort denying the problem and its importance. This bug was reported by Fangxin Flou.
Bug #98132 - "Analyze table leads to empty statistics during online rebuild DDL". A nice addition to my collections! This bug, with a nice and clear test case, was reported by Albert Hu, who also suggested a fix.
Bug #98139 - "Committing a XA transaction causes a wrong sequence of events in binlog". This bug, reported by Dehao Wang, was verified as a "documentation" one, but I doubt that properly documenting the current behavior is an acceptable fix. The bug reporter suggested, for example, committing in the binary log first. The current implementation, which allows users to commit or roll back an XA transaction from another connection if the original connection is closed or killed, is risky. A lot of arguing happened in the comments, and my comment asking for a clear quote from the manual: "Would you be so kind to share some text from this page you mentioned: https://dev.mysql.com/doc/refman/8.0/en/xa.html or any other fine MySQL 8 manual page stating that XA COMMIT is NOT supported when executed from a session/connection/thread other than the one that prepared the XA transaction? I am probably doing something wrong, but I can not find such text anywhere." was hidden. Let's see what happens to this bug report next.
Bug #98211 - "Auto increment value didn't reset correctly.". I am not sure what this bug, reported by Zhao Jianwei, has to do with "Data Types"; IMHO it's more about DDL or the data dictionary. Again, some sarcastic comments from Community users were needed to put work on this bug back on track...
Bug #98220 - "with log_slow_extra=on Errno: info not getting updated correctly for error". This bug was reported by lalit Choudhary from Percona.
Bug #98227 - "innodb_stats_method='nulls_ignored' and persistent stats get wrong cardinalities". I think the category is wrong for this bug. It's a bug in InnoDB's persistent statistics implementation, one of many. The bug was reported by Agustín G from Percona.
Bug #98231 - "show index from a partition table gets a wrong cardinality value". Yet another report by Albert Hu that ended up as a "documentation" bug for now, even though older MySQL versions provided better cardinality estimations than MySQL 8.0 in this case (so this is a regression of a kind). I hope the bug will be re-classified and properly processed later.
Bug #98238 - "I_S.KEY_COLUMN_USAGE is very slow". I am surprised to see such a bug in MySQL 8. According to the bug reporter, Manuel Mausz, this is also a kind of regression compared to older MySQL versions, where these queries used to run faster. Surely, no "regression" tag was added in this case.
Bug #98284 - "Low sysbench score in the case of a large number of connections". This notable performance regression of MySQL 8 vs 5.7 was reported by zanye zjy. perf profiling points towards ppoll(), where a lot of time is spent. There is a fix suggested by Fangxin Flou (to use poll() instead), but the bug is still "Open".
Bug #98287 - "Explanation of hash joins is inconsistent across EXPLAIN formats". This bug was reported by Saverio M and ended up marked as a duplicate of Bug #97299, fixed in the upcoming 8.0.20. Use EXPLAIN FORMAT=TREE in the meantime to see proper information about hash join usage in the plan (a minimal example is sketched at the end of this post).
Bug #98288 - "xa commit crash lead mysql replication error". This bug report from Phoenix Zhang (who also suggested a patch) was declared a duplicate of Bug #76233 - "XA prepare is logged ahead of engine prepare" (which I've already discussed among other XA transaction bugs here).
Bug #98324 - "Deadlocks more frequent since version 5.7.26". A nice regression bug report by Przemyslaw Malkowski from Percona, with an additional test provided later by Stephen Wei. Interestingly enough, the test results shared by Umesh Shastry show that MySQL 8.0.19 is affected in the same way as 5.7.26+, yet 8.0.19 is NOT listed as one of the affected versions. This is a mistake to fix, along with the missing regression tag.
Bug #98427 - "InnoDB FullText AUX Tables are broken in 8.0". Yet another regression in MySQL 8 was found by Satya Bodapati. It seems the change of the default collation for the utf8mb4 character set caused it. InnoDB FULLTEXT search was far from perfect anyway...
There are clouds in the sky of MySQL bug processing. To summarize: too much time and effort is still sometimes spent arguing with bug reporters instead of accepting and processing bugs properly, which is unfortunate. Sometimes bugs are wrongly classified when verified (documentation vs code bug, wrong category, wrong severity, not all affected versions listed, ignored regressions, etc.), which is also unfortunate. Percona engineers still help to make MySQL better. There are some fixes in the upcoming MySQL 8.0.20 that I am waiting for :) XA transactions in MySQL are badly broken (they are not atomic in the storage engine + binary log) and are hardly safe to use in reality.

  • 3 Step Migration of MySQL data to Clickhouse for faster analytics.
    Recently one of our clients approached Mydbops with query slowness on a MySQL environment. They had deployed new code to generate large reports for year-end analytics data. After the deployment the queries were extremely slow and they struggled a lot, so they approached us for a solution. As expected for an OLAP database, the analysis showed it was I/O bound, with 100% of disk IOPS utilised during report generation, so the queries were starving for disk I/O and the whole process slowed down.
Problem statement: The reports focus mainly on two large log tables (emp_Report_model, emp_details). The report generator (a procedure) uses count(*) statements to build the aggregated data on each call, which is required for their business purpose. count(*) is terribly slow in MySQL 5.7 as it needs to count all the rows in the table (MySQL 8.0 has InnoDB parallel read threads that can make count(*) faster; see the sketch below). A MySQL index can't help because each call aggregates almost the complete data set (about 90% of the data), so the queries end up as full table scans (FTS). Further analysis showed that these tables only receive an INSERT workload; there are no UPDATEs or DELETEs on them. We therefore proposed migrating the data to ClickHouse to overcome the problem.
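As a side note on the "InnoDB parallel read threads" remark above: this is a MySQL 8.0 feature (the client here was on 5.7), and it only speeds up bare full scans such as count(*), not the join-heavy reports. A hedged sketch, using the table name from this post; the thread count is an illustrative value:

    -- MySQL 8.0 only: let InnoDB scan the clustered index with several threads.
    -- The default is 4; raising it helps only if cores and I/O bandwidth allow it.
    SET SESSION innodb_parallel_read_threads = 16;
    SELECT COUNT(*) FROM emp_Report_model;

This alone would not remove the I/O-bound full-scan problem described above, which is why the analytics workload was moved to ClickHouse instead.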
What is ClickHouse? "ClickHouse is an open source column-oriented database management system capable of real-time generation of analytical data reports using SQL queries." (ClickHouse website). The major limitation of MySQL to ClickHouse replication is that we can only apply INSERT statements (appends) from MySQL; ClickHouse does not support UPDATEs and DELETEs, which makes sense for a columnar database.
ClickHouse installation: The installation is quite straightforward; the steps are available on the official ClickHouse web site.
yum install rpm-build epel-release
curl -s https://packagecloud.io/install/repositories/altinity/clickhouse/script.rpm.sh | sudo bash
yum install -y mysql-community-devel python34-devel python34-pip gcc python-devel libevent-devel gcc-c++ kernel-devel libxslt-devel libffi-devel openssl-devel python36 python36-devel python36-libs python36-tools
ClickHouse server:
yum install -y clickhouse-server clickhouse-client
ClickHouse MySQL replication library:
pip3 install clickhouse-mysql
ClickHouse startup:
[root@mydbopslabs192 ~]# /etc/init.d/clickhouse-server status
clickhouse-server service is stopped
[root@mydbopslabs192 ~]# /etc/init.d/clickhouse-server start
Start clickhouse-server service: /etc/init.d/clickhouse-server: line 166: ulimit: open files: cannot modify limit: Operation not permitted
Path to data directory in /etc/clickhouse-server/config.xml: /var/lib/clickhouse/
DONE
[root@mydbopslabs192 ~]# /etc/init.d/clickhouse-server status
clickhouse-server service is running
[root@mydbopslabs192 ~]# clickhouse-client
ClickHouse client version 19.17.4.11.
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 19.17.4 revision 54428.
mydbopslabs192 show databases;
SHOW DATABASES
┌─name────┐
│ default │
│ system  │
└─────────┘
2 rows in set. Elapsed: 0.003 sec.
That is all for the installation. The next step is to migrate the data from MySQL to ClickHouse and configure replication for those tables.
Data migration from MySQL to ClickHouse:
Step 1 (Dump the ClickHouse-compatible schema structure from MySQL): First we need to migrate the MySQL table structure to ClickHouse. MySQL and ClickHouse have different data types, so we cannot apply the same structure from MySQL to ClickHouse. The document below provides a neat comparison between MySQL and ClickHouse data types: https://shinguz.ch/blog/clickhouse-data-types-compared-to-mysql-data-types/
Let us convert the table structure from MySQL to ClickHouse using the clickhouse-mysql tool.
[root@mydbopslabs192 ~]# clickhouse-mysql --src-host=192.168.168.191 --src-user=clickhouse --src-password=Click@321 --create-table-sql-template --with-create-database --src-tables=data_Analytics.emp_Report_model,data_Analytics.emp_details > data_Reports_Jan21st.sql
2020-01-21 09:03:40,150/1579597420.150730:INFO:Starting
2020-01-21 09:03:40,150/1579597420.150977:DEBUG:{'app': {'binlog_position_file': None, 'config_file': '/etc/clickhouse-mysql/clickhouse-mysql.conf', 'create_table_json_template': False,
2020-01-21 09:03:40,223/1579597420.223511:DEBUG:Connect to the database host=192.168.168.191 port=3306 user=clickhouse password=Click@321 db=data_Analytics
2020-01-21 09:03:40,264/1579597420.264610:DEBUG:Connect to the database host=192.168.168.191 port=3306 user=clickhouse password=Click@321 db=data_Analytics
Dumping the table structure:
[root@mydbopslabs192 ~]# less data_Reports_Jan12th.sql | grep CREATE
CREATE DATABASE IF NOT EXISTS data_Analytics;
CREATE TABLE IF NOT EXISTS data_Analytics.emp_details (
CREATE DATABASE IF NOT EXISTS data_Analytics;
CREATE TABLE IF NOT EXISTS data_Analytics.emp_Report_model (
[root@mydbopslabs192 ~]# cat data_Reports_Jan12th.sql | head -n7
CREATE DATABASE IF NOT EXISTS data_Analytics;
CREATE TABLE IF NOT EXISTS data_Analytics.emp_details (
 WatchID Nullable(String),
 JavaEnable Nullable(Int32),
 Title Nullable(String),
 GoodEvent Nullable(Int32),
 EventTime Nullable(DateTime),
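Note that the head -n7 excerpt above stops before the engine clause, but a complete ClickHouse table definition always needs one. A minimal sketch of what such a definition can look like, with the columns copied from the excerpt; the MergeTree engine and the empty sorting key are illustrative assumptions, not necessarily what clickhouse-mysql generates:

    CREATE TABLE IF NOT EXISTS data_Analytics.emp_details (
        WatchID    Nullable(String),
        JavaEnable Nullable(Int32),
        Title      Nullable(String),
        GoodEvent  Nullable(Int32),
        EventTime  Nullable(DateTime)
        -- ... remaining columns from the dump ...
    ) ENGINE = MergeTree
    ORDER BY tuple();

Picking a real ORDER BY key (for example a date or time column) is usually what makes MergeTree scans fast, so it is worth reviewing the generated DDL before importing it.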
Step 2 (Import the schema structure into ClickHouse):
[root@mydbopslabs192 ~]# clickhouse-client -mn < data_Reports_Jan12th.sql
[root@mydbopslabs192 ~]#
[root@mydbopslabs192 ~]# fg
clickhouse-client
mydbopslabs192 use data_Analytics
USE data_Analytics
Ok.
0 rows in set. Elapsed: 0.001 sec.
mydbopslabs192 show tables;
SHOW TABLES
┌─name─────────────┐
│ emp_Report_model │
│ emp_details      │
└──────────────────┘
2 rows in set. Elapsed: 0.003 sec.
Step 3 (Migrate the data and keep replication in sync): Before configuring the replication, the MySQL server should be configured with the variables below.
Mandatory MySQL settings:
server-id = <your id>
binlog_format = ROW
binlog_row_image = FULL
Now we can configure the replication in two ways: i) migrate the existing data, then configure the replication, or ii) migrate the existing data and configure the replication in one step.
i) Migrate the existing data, then configure the replication:
Command for migrating the existing data:
[root@mydbopslabs192 ~]# clickhouse-mysql --src-host=192.168.168.191 --src-user=clickhouse --src-password=Click@321 --migrate-table --src-tables=data_Analytics.emp_Report_model --dst-host=127.0.0.1 --dst-schema data_Analytics --dst-table emp_Report_model --log-file=emp_Report_model.log
[root@mydbopslabs192 ~]# less emp_Report_model.log | grep -i migra
'migrate_table': True,
'table_migrator': {'clickhouse': {'connection_settings': {'host': '127.0.0.1',
2020-01-21 11:04:57,744/1579604697.744533:INFO:List for migration:
2020-01-21 11:04:57,744/1579604697.744947:INFO:Start migration data_Analytics.emp_Report_model
2020-01-21 11:04:57,891/1579604697.891935:INFO:migrate_table. sql=SELECT WatchID,JavaEnable,Title,GoodEvent,EventTime,Eventdate,CounterID,ClientIP,ClientIP6,RegionID,UserID,CounterClass,OS,UserAgent,URL,Referer,URLDomain,RefererDomain,Refresh,IsRobot,RefererCategories,URLCategories,URLRegions,RefererRegions,ResolutionWidth,ResolutionHeight,ResolutionDepth,FlashMajor,FlashMinor,FlashMinor2,NetMajor,NetMinor,UserAgentMajor,UserAgentMinor,CookieEnable,JavascriptEnable,IsMobile,MobilePhone,MobilePhoneModel,Params,IPNetworkID,TraficSourceID,SearchEngineID,SearchPhrase,AdvEngineID,IsArtifical,WindowClientWidth,WindowClientHeight,ClientTimeZone,ClientEventTime,SilverlightVersion1,SilverlightVersion2,SilverlightVersion3,SilverlightVersion4,PageCharset,CodeVersion,IsLink,IsDownload,IsNotBounce,FUniqID,HID,IsOldCounter,IsEvent,IsParameter,DontCountHits,WithHash,HitColor,UTCEventTime,Age,Sex,Income,Interests,Robotness,GeneralInterests,RemoteIP,RemoteIP6,WindowName,OpenerName,HistoryLength,BrowserLanguage,BrowserCountry,SocialNetwork,SocialAction,HTTPError,SendTiming,DNSTiming,ConnectTiming,ResponseStartTiming,ResponseEndTiming,FetchTiming,RedirectTiming,DOMInteractiveTiming,DOMContentLoadedTiming,DOMCompleteTiming,LoadEventStartTiming,LoadEventEndTiming,NSToDOMContentLoadedTiming,FirstPaintTiming,RedirectCount,SocialSourceNetworkID,SocialSourcePage,ParamPrice,ParamOrderID,ParamCurrency,ParamCurrencyID,GoalsReached,OpenstatServiceName,OpenstatCampaignID,OpenstatAdID,OpenstatSourceID,UTMSource,UTMMedium,UTMCampaign,UTMContent,UTMTerm,FromTag,HasGCLID,RefererHash,URLHash,CLID,YCLID,ShareService,ShareURL,ShareTitle,IslandID,RequestNum,RequestTry FROM data_Analytics.emp_Report_model
mydbopslabs192 select count(*) from data_Analytics.emp_Report_model;
┌─count()─┐
│ 8873898 │
└─────────┘
1 rows in set. Elapsed: 0.005 sec.
Configuring the replication:
[root@mydbopslabs192 ~]# clickhouse-mysql --src-host=192.168.168.191 --src-user=clickhouse --src-password=Click@321 --src-tables=data_Analytics.emp_Report_model --dst-host=127.0.0.1 --dst-schema data_Analytics --dst-table emp_Report_model --src-resume --src-wait --nice-pause=1 --log-level=info --csvpool --mempool-max-flush-interval=60 --mempool-max-events-num=1000 --pump-data --src-server-id=100 --log-file=emp_Report_model_Replication.log
2020-01-21 11:22:18,974/1579605738.974186:INFO:CSVWriter() self.path=/tmp/csvpool_1579605738.9738157_d643efe5-5ae0-47df-8504-40f61f2c139f.csv
2020-01-21 11:22:18,976/1579605738.976613:INFO:CHCSWriter() connection_settings={'port': 9000, 'host': '127.0.0.1', 'password': '', 'user': 'default'} dst_schema=data_Analytics dst_table=emp_Report_model
2020-01-21 11:22:18,976/1579605738.976936:INFO:starting clickhouse-client process
2020-01-21 11:22:19,160/1579605739.160906:INFO:['data_Analytics.emp_Report_model']
2020-01-21 11:22:19,166/1579605739.166096:INFO:['data_Analytics.emp_Report_model']
2020-01-21 11:22:19,170/1579605739.170744:INFO:['data_Analytics.emp_Report_model']
ii) Migrate the existing data and configure the replication in one step:
Here we need to add the --migrate-table flag to the replication command.
[root@mydbopslabs192 ~]# clickhouse-mysql --src-host=192.168.168.191 --src-user=clickhouse --src-password=Click@321 --src-tables=data_Analytics.emp_Report_model --dst-host=127.0.0.1 --dst-schema data_Analytics --dst-table emp_Report_model --src-resume --src-wait --nice-pause=1 --log-level=info --csvpool --mempool-max-flush-interval=60 --mempool-max-events-num=1000 --pump-data --src-server-id=100 --migrate-table --log-file=emp_Report_model_replication_mig.log
[root@mydbopslabs192 ~]# less emp_Report_model_replication_mig.log | grep -i mig
2020-01-21 11:27:53,263/1579606073.263505:INFO:List for migration:
2020-01-21 11:27:53,263/1579606073.263786:INFO:Start migration data_Analytics.emp_Report_model
2020-01-21 11:27:53,316/1579606073.316788:INFO:migrate_table. sql=SELECT WatchID,JavaEnable,Title,GoodEvent,EventTime,Eventdate,CounterID,ClientIP,ClientIP6,RegionID,UserID,CounterClass,OS,UserAgent,URL,Referer,URLDomain,RefererDomain,Refresh,IsRobot,RefererCategories,URLCategories,URLRegions,RefererRegions,ResolutionWidth,ResolutionHeight,ResolutionDepth,FlashMajor,FlashMinor,FlashMinor2,NetMajor,NetMinor,UserAgentMajor,UserAgentMinor,CookieEnable,JavascriptEnable,IsMobile,MobilePhone,MobilePhoneModel,Params,IPNetworkID,TraficSourceID,SearchEngineID,SearchPhrase,AdvEngineID,IsArtifical,WindowClientWidth,WindowClientHeight,ClientTimeZone,ClientEventTime,SilverlightVersion1,SilverlightVersion2,SilverlightVersion3,SilverlightVersion4,PageCharset,CodeVersion,IsLink,IsDownload,IsNotBounce,FUniqID,HID,IsOldCounter,IsEvent,IsParameter,DontCountHits,WithHash,HitColor,UTCEventTime,Age,Sex,Income,Interests,Robotness,GeneralInterests,RemoteIP,RemoteIP6,WindowName,OpenerName,HistoryLength,BrowserLanguage,BrowserCountry,SocialNetwork,SocialAction,HTTPError,SendTiming,DNSTiming,ConnectTiming,ResponseStartTiming,ResponseEndTiming,FetchTiming,RedirectTiming,DOMInteractiveTiming,DOMContentLoadedTiming,DOMCompleteTiming,LoadEventStartTiming,LoadEventEndTiming,NSToDOMContentLoadedTiming,FirstPaintTiming,RedirectCount,SocialSourceNetworkID,SocialSourcePage,ParamPrice,ParamOrderID,ParamCurrency,ParamCurrencyID,GoalsReached,OpenstatServiceName,OpenstatCampaignID,OpenstatAdID,OpenstatSourceID,UTMSource,UTMMedium,UTMCampaign,UTMContent,UTMTerm,FromTag,HasGCLID,RefererHash,URLHash,CLID,YCLID,ShareService,ShareURL,ShareTitle,IslandID,RequestNum,RequestTry FROM data_Analytics.emp_Report_model
[root@mydbopslabs192 ~]# less emp_Report_model_replication_mig.log | grep -i process
2020-01-21 11:28:01,071/1579606081.071054:INFO:starting clickhouse-client process
Validating the count after inserting some records in MySQL (the source):
mydbopslabs192 select count(*) from data_Analytics.emp_Report_model;
┌─count()─┐
│ 8873900 │
└─────────┘
MySQL to ClickHouse replication is working as expected.
Performance comparison for an OLAP workload (MySQL vs ClickHouse):
count(*) in MySQL:
mysql> select count(*) from emp_Report_model;
1 row in set (32.68 sec)
count(*) in ClickHouse:
mydbopslabs192 select count(*) from emp_Report_model;
1 rows in set. Elapsed: 0.007 sec.
Aggregated query in MySQL:
mysql> select emp_Report_model.WatchID,emp_Report_model.JavaEnable,emp_Report_model.Title,emp_Report_model.RegionID from emp_Report_model inner join emp_details on emp_Report_model.WatchID=emp_details.WatchID and emp_Report_model.RegionID=emp_details.RegionID and emp_Report_model.UserAgentMajor=emp_details.UserAgentMajor where emp_Report_model.SocialSourcePage is not null and emp_details.FetchTiming != 0 order by emp_Report_model.WatchID;
292893 rows in set (1 min 2.61 sec)
Aggregated query in ClickHouse:
mydbopslabs192 select emp_Report_model.WatchID,emp_Report_model.JavaEnable,emp_Report_model.Title,emp_Report_model.RegionID from emp_Report_model inner join emp_details on emp_Report_model.WatchID=emp_details.WatchID and emp_Report_model.RegionID=emp_details.RegionID and emp_Report_model.UserAgentMajor=emp_details.UserAgentMajor where emp_Report_model.SocialSourcePage is not null and emp_details.FetchTiming != 0 order by emp_Report_model.WatchID;
292893 rows in set. Elapsed: 1.710 sec. Processed 9.37 million rows, 906.15 MB (7.75 million rows/s., 749.15 MB/s.)
Yes, ClickHouse performs very well with count(*) and analytical queries.
Query model    MySQL        ClickHouse
count(*)       33 seconds   0.1 seconds
OLAP query     63 seconds   1.7 seconds
(The graph in the original post is a pictorial representation of the queries tested.)
Though ClickHouse excels at analytics workloads, it has its own limitations too. Now we have another happy customer at Mydbops whose analytics dashboard is faster.
Featured image credit: Stephen Dawson on Unsplash

  • Galera Replication flow Architecture
    Galera is the best solution for high availability and is used by many people worldwide. Galera performs synchronous replication (strictly speaking, certification-based replication) to keep the data up to date on all nodes of the group. In this blog I explain how Galera replication works. For better understanding, I have made an architecture diagram describing the replication flow (see the original post for the diagram), and I explain the key terms used in it below.
What is a writeset? A writeset contains all changes made to the database by the transaction, together with the append_key of the changed rows.
What is an append_key? The append_key registers the key of the data changed by the transaction. The key for a row is represented in three parts: DATABASE NAME, TABLE NAME, PRIMARY KEY. If the table has no PRIMARY KEY, a hash of the modified data becomes part of the writeset instead.
What is certification in Galera? Certification is performed to detect conflicts and maintain data consistency among the group. It happens before the transaction commit.
What is the CCV (Central Certification Vector)? The CCV is used to detect conflicts. The modified keys are added into the Central Certification Vector; if an added key is already part of the vector, conflict resolution checks are triggered. A practical way to observe certification outcomes is sketched below.
I hope this blog helps someone who is working with Galera Cluster. I will come up with my next blog soon. Thanks!
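As a practical aside (not part of the original post): on a wsrep-enabled node the outcome of certification can be observed through status counters, and a transaction that fails certification is returned to the client as a deadlock error. A minimal sketch, assuming a Galera/wsrep build of MySQL or MariaDB:

    -- Local transactions that failed the certification test (conflicts with replicated writesets):
    SHOW GLOBAL STATUS LIKE 'wsrep_local_cert_failures';
    -- Average distance between the lowest and highest seqno certified together,
    -- a rough indicator of how much parallelism certification allows:
    SHOW GLOBAL STATUS LIKE 'wsrep_cert_deps_distance';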

  • Requirements on NDB Cluster 8.0
    In this blog I am going to go through the most important requirements that NDB Cluster 8.0 is based on. I am also going to list a number of consequences these requirements have for the product and what it supports. On slideshare.net I uploaded a presentation of the NDB Cluster 8.0 requirements. In this blog and several accompanying ones I am going to present the reasoning that these requirements led to in terms of software architecture, data structures and so forth.
The requirements on NDB Cluster 8.0 are the following:
1) Unavailability of less than 30 seconds per year (Class 6 availability)
2) Predictable latency
3) Transparent distribution and replication
4) Write and read scalability
5) Highest availability even with 2 replicas
6) Support SQL, LDAP, file system interfaces, ...
7) Mixed OLTP and OLAP for real-time data analysis
8) Follow HW development for CPUs, networks, disks and memories
9) Follow HW development in cloud setups
The original requirement on NDB Cluster was to only support Class 5 availability. Telecom providers have continued to support ever higher numbers of subscribers per telecom database, driving the requirement up to Class 6 availability. NDB Cluster has a more than 15-year proven track record of handling Class 6 availability.
The requirement on predictable latency means that we need to be able to handle transactions involving around twenty operations within 10 milliseconds, even when the cluster is working at a high load.
To make sure that application development is easy, we opted for a model where distribution and replication are transparent to the application code. This means that NDB Cluster is one of very few DBMSs that supports these auto-sharding requirements.
High write scalability has been a major requirement in NDB from day one. NDB Cluster can handle tens of millions of transactions per second; most competing DBMS products that are based on replication protocols can only handle tens of thousands of transactions per second.
We used an arbitration model to avoid the requirement of 3 replicas. With NDB Cluster 8.0 we fully support 3 and 4 replicas as well, but even with 2 replicas we get the same availability that competing products based on replication protocols require 3 replicas for.
The original requirements on NDB didn't include a SQL interface; an efficient API was much more important for telecom applications. However, when meeting customers of a DBMS it was obvious that an SQL interface was needed, so this requirement was added in the early 2000s. Most early users of NDB Cluster still opted for a more direct API, and this means that NDB Cluster today has LDAP interfaces through OpenLDAP, a file system interface through HopsFS, and a lot of products that use the NDB API (C++), ClusterJ (Java) and an NDB NodeJS API.
The development model of NDB makes it possible to also handle complex queries in an efficient manner. Thus, in the development of NDB Cluster 8.0 we added the requirement to better support OLAP use cases on the OLTP data that is stored in NDB Cluster. We have already made very significant improvements in this area by supporting parallelised filters and, to a great extent, parallelisation of join processing in the NDB data nodes. This is an active development area for the coming generations of NDB Cluster.
NDB Cluster started its development in the 1990s. Already then we could foresee some of the HW development that was going to happen, and the product has been able to scale as HW has become more and more scalable. Today this means that each node in NDB Cluster can scale to 64 cores, data nodes can scale to 16 TB of memory and at least 100 TB of disk data, and the cluster can benefit greatly from higher and higher bandwidth on the network.
Finally, modern deployments often happen in cloud environments. Clouds are based on an availability model with regions, availability domains and failure domains. Thus the NDB Cluster software needs to make it possible to make efficient use of locality in the HW configurations.

  • Original NDB Cluster Requirements
    NDB Cluster was originally developed for network databases in the telecom network. I worked in an EU project between 1991 and 1995 that focused on developing a pre-standardisation effort on UMTS, which later became standardised under the term 3G. I worked in a part of the project where we focused on simulating the network traffic in such a 3G network, and I was focusing my attention especially on the requirements that this created on a network database in the telecom network. In the same time period I also dived deeply into the research literature about DBMS implementation.
The following requirements from the 3G studies emerged as the most important:
1) Class 5 availability (less than 5 minutes of unavailability per year)
2) High write scalability as well as high read scalability
3) Predictable latency down to milliseconds
4) Efficient API
5) Failover in crash scenarios within seconds or even subseconds with a real-time OS
In another blog, on the influences leading to the use of an asynchronous programming model in NDB Cluster, we derive the following requirements on the software architecture:
1) Fail-fast architecture (implemented through the ndbrequire macro in NDB)
2) Asynchronous programming (provides much tracing information in crashes)
3) Highly modular SW architecture
4) JAM macros to track SW in crash events
In another blog I present the influences that led to NDB Cluster using a shared-nothing model.
One important requirement that NDB Cluster is fairly unique in addressing is high write scalability. Most DBMSs solve this by grouping together large numbers of small transactions to make commits more efficient, which means that most DBMSs have a very high cost of committing a transaction. Modern replicated protocols have actually made this even worse: in most of them all transactions have to commit in a serial fashion. This means that commit handling is a major bottleneck in many modern DBMSs, and it often limits their transaction rates to tens of thousands of commits per second.
NDB Cluster went another path and essentially commits every single row change separately from any other row change. Thus the cost of executing 1000 transactions with 1000 operations per transaction is exactly the same as the cost of executing 1 million single-row transactions.
To achieve the grouping we used the fact that we are working in an asynchronous environment. Thus we used several levels of piggybacking of messages. One of the most important things here is that one socket is used to transport many thousands of simultaneous database transactions. With NDB Cluster 8.0.20 we use multiple sockets between data nodes, and this scales another 10-20x to ensure that HW limitations are the bottleneck and not the NDB software.
The asynchronous programming model ensures that we can handle thousands of operations each millisecond, and that changing from working on one transaction to another is a matter of tens to hundreds of nanoseconds. In addition, we can handle these transactions independently in a number of different data nodes and even within different threads within the same data node. Thus we can handle tens of millions of transactions per second even within a single data node.
The protocol we used for this is a variant of the two-phase commit protocol, with some optimisations based on the linear two-phase commit protocol. However, the requirement on Class 5 availability meant that we had to solve the blocking part of the two-phase commit protocol. We solved this by recreating the state of the failed transaction coordinators in a surviving node as part of failover handling. This means that we will never be blocked by a failure as long as there is still a sufficient number of nodes to keep the cluster operational.


