Tools Series: An Introduction to PandasAI

Ciyun Data (慈云数据), 2024-03-25, Technical Support

Table of Contents

  • PandasAI
    • Setup
    • SmartDataframe
        • Importing from a pandas DataFrame
        • Plotting charts
        • Smart Datalake
        • Different LLMs
          • LangChain LLMs
          • Connectors

            PandasAI

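            PandasAI is a library that makes data analysis conversational and fun. It combines pandas DataFrames with state-of-the-art LLMs so that users can analyze data by asking questions in natural language.

            In the spirit of what pandas does (10 minutes to pandas -> https://pandas.pydata.org/docs/user_guide/10min.html), we want to offer the simplest way to learn how to use PandasAI.

            Let's get started!

            Setup

            To get started, we need to install the latest version of PandasAI.

            # Install the pandasai library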
            !pip install pandasai
            
            Collecting pandasai
              Downloading pandasai-1.2.7-py3-none-any.whl (73 kB)
            Collecting astor=0.8.1 (from pandasai)
              Downloading astor-0.8.1-py2.py3-none-any.whl (27 kB)
            Requirement already satisfied: duckdb=0.8.1 in /usr/local/lib/python3.10/dist-packages (from pandasai) (0.8.1)
            Collecting ipython=8.13.1 (from pandasai)
              Downloading ipython-8.15.0-py3-none-any.whl (806 kB)
            Requirement already satisfied: matplotlib=3.7.1 in /usr/local/lib/python3.10/dist-packages (from pandasai) (3.7.1)
            Collecting openai=0.27.5 (from pandasai)
              Downloading openai-0.27.10-py3-none-any.whl (76 kB)
            Requirement already satisfied: pandas==1.5.3 in /usr/local/lib/python3.10/dist-packages (from pandasai) (1.5.3)
            Requirement already satisfied: pydantic=1 in /usr/local/lib/python3.10/dist-packages (from pandasai) (1.10.12)
            Collecting python-dotenv=1.0.0 (from pandasai)
              Downloading python_dotenv-1.0.0-py3-none-any.whl (19 kB)
            Requirement already satisfied: scipy=1.9.0 in /usr/local/lib/python3.10/dist-packages (from pandasai) (1.11.2)
            Collecting sqlalchemy=1.4.49 (from pandasai)
              Downloading SQLAlchemy-1.4.49-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)
            Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai) (2.8.2)
            Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai) (2023.3.post1)
            Requirement already satisfied: numpy>=1.21.0 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai) (1.23.5)
            Requirement already satisfied: backcall in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai) (0.2.0)
            Requirement already satisfied: decorator in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai) (4.4.2)
            Collecting jedi>=0.16 (from ipython=8.13.1->pandasai)
              Downloading jedi-0.19.0-py2.py3-none-any.whl (1.6 MB)
            Requirement already satisfied: matplotlib-inline in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai) (0.1.6)
            Requirement already satisfied: pickleshare in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai) (0.7.5)
            Requirement already satisfied: prompt-toolkit!=3.0.37,=3.0.30 in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai) (3.0.39)
            Requirement already satisfied: pygments>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai) (2.16.1)
            Collecting stack-data (from ipython=8.13.1->pandasai)
              Downloading stack_data-0.6.2-py3-none-any.whl (24 kB)
            Requirement already satisfied: traitlets>=5 in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai) (5.7.1)
            Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai) (1.1.3)
            Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai) (4.8.0)
            Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai) (1.1.0)
            Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai) (0.11.0)
            Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai) (4.42.1)
            Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai) (1.4.5)
            Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai) (23.1)
            Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai) (9.4.0)
            Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai) (3.1.1)
            Requirement already satisfied: requests>=2.20 in /usr/local/lib/python3.10/dist-packages (from openai=0.27.5->pandasai) (2.31.0)
            Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from openai=0.27.5->pandasai) (4.66.1)
            Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from openai=0.27.5->pandasai) (3.8.5)
            Requirement already satisfied: typing-extensions>=4.2.0 in /usr/local/lib/python3.10/dist-packages (from pydantic=1->pandasai) (4.5.0)
            Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from sqlalchemy=1.4.49->pandasai) (2.0.2)
            Requirement already satisfied: parso=0.8.3 in /usr/local/lib/python3.10/dist-packages (from jedi>=0.16->ipython=8.13.1->pandasai) (0.8.3)
            Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.10/dist-packages (from pexpect>4.3->ipython=8.13.1->pandasai) (0.7.0)
            Requirement already satisfied: wcwidth in /usr/local/lib/python3.10/dist-packages (from prompt-toolkit!=3.0.37,=3.0.30->ipython=8.13.1->pandasai) (0.2.6)
            Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas==1.5.3->pandasai) (1.16.0)
            Requirement already satisfied: charset-normalizer=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai=0.27.5->pandasai) (3.2.0)
            Requirement already satisfied: idna=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai=0.27.5->pandasai) (3.4)
            Requirement already satisfied: urllib3=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai=0.27.5->pandasai) (2.0.4)
            Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.20->openai=0.27.5->pandasai) (2023.7.22)
            Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai=0.27.5->pandasai) (23.1.0)
            Requirement already satisfied: multidict=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai=0.27.5->pandasai) (6.0.4)
            Requirement already satisfied: async-timeout=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai=0.27.5->pandasai) (4.0.3)
            Requirement already satisfied: yarl=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai=0.27.5->pandasai) (1.9.2)
            Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai=0.27.5->pandasai) (1.4.0)
            Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->openai=0.27.5->pandasai) (1.3.1)
            Collecting executing>=1.2.0 (from stack-data->ipython=8.13.1->pandasai)
              Downloading executing-1.2.0-py2.py3-none-any.whl (24 kB)
            Collecting asttokens>=2.1.0 (from stack-data->ipython=8.13.1->pandasai)
              Downloading asttokens-2.4.0-py2.py3-none-any.whl (27 kB)
            Collecting pure-eval (from stack-data->ipython=8.13.1->pandasai)
              Downloading pure_eval-0.2.2-py3-none-any.whl (11 kB)
            Installing collected packages: pure-eval, executing, sqlalchemy, python-dotenv, jedi, asttokens, astor, stack-data, openai, ipython, pandasai
              Attempting uninstall: sqlalchemy
                Found existing installation: SQLAlchemy 2.0.20
                Uninstalling SQLAlchemy-2.0.20:
                  Successfully uninstalled SQLAlchemy-2.0.20
              Attempting uninstall: ipython
                Found existing installation: ipython 7.34.0
                Uninstalling ipython-7.34.0:
                  Successfully uninstalled ipython-7.34.0
            ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
            google-colab 1.0.0 requires ipython==7.34.0, but you have ipython 8.15.0 which is incompatible.
            ipython-sql 0.5.0 requires sqlalchemy>=2.0, but you have sqlalchemy 1.4.49 which is incompatible.
            Successfully installed astor-0.8.1 asttokens-2.4.0 executing-1.2.0 ipython-8.15.0 jedi-0.19.0 openai-0.27.10 pandasai-1.2.7 pure-eval-0.2.2 python-dotenv-1.0.0 sqlalchemy-1.4.49 stack-data-0.6.2
            
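The log above ends with dependency conflicts, so it can be worth confirming which versions actually ended up installed. The following is a minimal sketch that uses only the standard library; the helper name installed_version is illustrative and is not part of PandasAI.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Packages that the install log above touched or complained about.
for name in ("pandasai", "ipython", "sqlalchemy"):
    print(name, installed_version(name))
```

If a package prints None, it is not installed in the environment the notebook is running in.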

            SmartDataframe

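            A SmartDataframe is a pandas (or polars) DataFrame that inherits from pd.DataFrame. It keeps all the attributes and methods of pd.DataFrame and adds conversational features on top.

            # Import the SmartDataframe class from the pandasai library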
            from pandasai import SmartDataframe
            

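            You can instantiate a SmartDataframe from several different sources: a pandas or polars DataFrame, a csv file, an xlsx file, or Google Sheets.

            Importing from a pandas DataFrame

            To import data from a pandas DataFrame, first import pandas and create a DataFrame.

            # Import the pandas library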
            import pandas as pd
            
            # Create a DataFrame with country, GDP, and happiness index data
            df = pd.DataFrame({
                "country": [
                    "United States",
                    "United Kingdom",
                    "France",
                    "Germany",
                    "Italy",
                    "Spain",
                    "Canada",
                    "Australia",
                    "Japan",
                    "China",
                ],
                "gdp": [
                    19294482071552,  # GDP values, one per country in the order above
                    2891615567872,
                    2411255037952,
                    3435817336832,
                    1745433788416,
                    1181205135360,
                    1607402389504,
                    1490967855104,
                    4380756541440,
                    14631844184064,
                ],
                "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12],  # happiness index values
            })
            

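            Because PandasAI is powered by an LLM, you need to import the LLM you want to use for your use case. Here we will use OpenAI.

            To use OpenAI you need an API token. Follow these simple steps to generate your API_TOKEN:

            1. Go to https://openai.com/api/ and sign up with your email address or connect your Google account.
            2. On the left side of your personal account settings, click "View API Keys".
            3. Select "Create new Secret key".

            Access to the OpenAI API is a paid service. Read the Pricing information before experimenting.

            # Import the OpenAI class
            from pandasai.llm import OpenAI
            # Create an OpenAI object, passing the api_token parameter
            llm = OpenAI(api_token="YOUR TOKEN")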
            
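Hard-coding the token in a notebook makes it easy to leak. One common alternative is to read it from an environment variable. The sketch below assumes a variable named OPENAI_API_KEY and a helper named load_api_token; neither comes from the original article, and the returned string is simply what you would pass as api_token.

```python
import os


def load_api_token(var_name: str = "OPENAI_API_KEY") -> str:
    """Read an API token from the environment and fail early if it is absent."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return token


# Usage (assumes the variable was exported before the notebook was started):
# llm = OpenAI(api_token=load_api_token())
```

Failing early with a clear message is usually easier to debug than an authentication error raised later, in the middle of a chat call.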

            Now that the LLM is instantiated, we can finally instantiate the SmartDataframe.

            # Create a SmartDataframe from the DataFrame df, passing the LLM via config={"llm": llm}
            sdf = SmartDataframe(df, config={"llm": llm})
            

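            A SmartDataframe inherits all the methods and attributes of the original DataFrame. For example:

            # Filter with a boolean condition: keep the rows where country is 'United States'
            result = sdf[sdf['country'] == 'United States']
            # Print the result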
            print(result)
            

            But you can also query it in natural language.

            # Call chat with the prompt "Return the top 5 countries by GDP"
            sdf.chat("Return the top 5 countries by GDP")
            
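Because the answer comes from code written by an LLM, it helps to cross-check it against plain pandas from time to time. The following is a minimal sketch that uses the same country and GDP data as above and does not involve PandasAI at all.

```python
import pandas as pd

# Same country and gdp columns as the DataFrame defined earlier in the article.
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832,
            1745433788416, 1181205135360, 1607402389504, 1490967855104,
            4380756541440, 14631844184064],
})

# nlargest keeps the n rows with the largest values in the given column.
top5 = df.nlargest(5, "gdp")
print(top5["country"].tolist())
```

If the list returned by chat differs from this one, inspect the generated code before trusting the answer.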

            # Call chat with a question as the argument
            sdf.chat("What's the sum of the gdp of the 2 unhappiest countries?")
            
            # Print the last_code_generated attribute of the sdf object
            print(sdf.last_code_generated)
            
            def analyze_data(dfs: list[pd.DataFrame]) ->dict:
                df_combined = pd.concat(dfs)
                df_sorted = df_combined.sort_values('happiness_index')
                sum_gdp = df_sorted.head(2)['gdp'].sum()
                return {'type': 'number', 'value': sum_gdp}
            result = analyze_data(dfs)
            
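The printed function is ordinary pandas, so it can be run by hand to see what chat computed. The sketch below copies it as shown and calls it with a one-element list. Treating dfs as a list that holds the original DataFrame is an assumption about how PandasAI invokes the function.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832,
            1745433788416, 1181205135360, 1607402389504, 1490967855104,
            4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12],
})


def analyze_data(dfs: list) -> dict:
    # Generated code from above, unchanged apart from the simplified type hint.
    df_combined = pd.concat(dfs)
    df_sorted = df_combined.sort_values("happiness_index")
    sum_gdp = df_sorted.head(2)["gdp"].sum()
    return {"type": "number", "value": sum_gdp}


result = analyze_data([df])
# China and Japan have the two lowest happiness scores in this data.
print(result["type"], result["value"])
```

Note that sort_values sorts in ascending order by default, which is why head(2) picks the two unhappiest countries.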

            Plotting charts

            You can also plot charts easily with PandasAI.

            # Call chat with the prompt "Plot a chart of the gdp by country"
            sdf.chat("Plot a chart of the gdp by country")
            

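            You can also give extra instructions. For example, suppose you want a different color for each bar. You only need to ask PandasAI for it.

            # Call the SmartDataframe's chat method and ask for a chart of gdp by country
            # in which each bar has a different color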
            sdf.chat("Plot a histogram of the gdp by country, using a different color for each bar")
            

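            As an alternative, you can use shortcuts. A shortcut is a method that saves you from writing the prompt and does the "magic" for you behind the scenes.

            For example, you can generate the same chart with .plot_bar_chart(), passing the fields to plot:

            # Plot a bar chart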
            sdf.plot_bar_chart(x="country", y="gdp")
            

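            Similarly, to visualize the data as a pie chart, call the plot_pie_chart shortcut, passing the field to use for the labels and the field to use for the values.

            # Plot a pie chart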
            sdf.plot_pie_chart(labels="country", values="gdp")
            
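For comparison, here is roughly what such charts look like when drawn by hand. This is a plain pandas and matplotlib sketch, not PandasAI's implementation. The Agg backend, the in-memory PNG buffer, and the four-country subset are assumptions chosen so that it runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# A four-country subset of the data used in the article.
df = pd.DataFrame({
    "country": ["United States", "China", "Japan", "Germany"],
    "gdp": [19294482071552, 14631844184064, 4380756541440, 3435817336832],
})

fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: one bar per country, each with its own color from the tab10 palette.
colors = plt.cm.tab10(range(len(df)))
bars = ax_bar.bar(df["country"], df["gdp"], color=colors)
ax_bar.set_ylabel("gdp")
ax_bar.tick_params(axis="x", labelrotation=45)

# Pie chart: country as the labels, gdp as the values.
ax_pie.pie(df["gdp"], labels=df["country"])

fig.tight_layout()

# Render to an in-memory PNG instead of writing a file.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```

The x and y arguments of the bar shortcut, and the labels and values arguments of the pie shortcut, correspond to the columns passed to bar and pie here.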

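            Smart Datalake

            Sometimes you may want to work with several DataFrames at once and let the LLM decide which ones to use to answer your query. In that case, use SmartDatalake instead of SmartDataframe.

            The concept is very similar to SmartDataframe, but it accepts multiple DataFrames as input rather than just one.

            # Import the SmartDatalake class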
            from pandasai import SmartDatalake
            

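            In the following example we provide two different DataFrames.

            The first lists each employee's ID, name, and department.

            The second lists each employee's ID and salary.

            When asked, PandasAI joins the two DataFrames on the employee ID and finds the name of the highest-paid employee.

            # Create the DataFrame of employee information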
            employees_df = pd.DataFrame(
                {
                    "EmployeeID": [1, 2, 3, 4, 5],
                    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
                    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
                }
            )
            # Create the DataFrame of salary information
            salaries_df = pd.DataFrame(
                {
                    "EmployeeID": [1, 2, 3, 4, 5],
                    "Salary": [5000, 6000, 4500, 7000, 5500],
                }
            )
            # Create a SmartDatalake from the employee and salary DataFrames
            lake = SmartDatalake(
                [employees_df, salaries_df],
                config={"llm": llm}
            )
            # Call chat with the question "Who gets paid the most?"
            lake.chat("Who gets paid the most?")
            

            Here is the code that was generated:

            # Print the last code executed by the lake
            print(lake.last_code_executed)
            
            def analyze_data(dfs: list[pd.DataFrame]) ->dict:
                """
                Analyze the data
                1. Prepare: Preprocessing and cleaning data if necessary
                2. Process: Manipulating data for analysis (grouping, filtering, aggregating, etc.)
                3. Analyze: Conducting the actual analysis (if the user asks to plot a chart save it to an image in exports/charts/temp_chart.png and do not show the chart.)
                4. Output: return a dictionary of:
                - type (possible values "text", "number", "dataframe", "plot")
                - value (can be a string, a dataframe or the path of the plot, NOT a dictionary)
                Example output: { "type": "text", "value": "The average loan amount is $15,000." }
                """
                merged_df = pd.merge(dfs[0], dfs[1], on='EmployeeID')
                max_salary_employee = merged_df.loc[merged_df['Salary'].idxmax()]
                employee_name = max_salary_employee['Name']
                return {'type': 'text', 'value': f'The employee who gets paid the most is {employee_name}.'}
            
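The join that PandasAI chose can be reproduced directly. The following is a minimal sketch with the same two DataFrames. It is plain pandas and only mirrors the logic of the generated code; the validate argument is an addition that the generated code does not use.

```python
import pandas as pd

employees_df = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
})
salaries_df = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
})

# Inner join on the shared key column. validate="one_to_one" makes pandas raise
# a MergeError if an EmployeeID is duplicated on either side.
merged_df = pd.merge(employees_df, salaries_df, on="EmployeeID", validate="one_to_one")

# idxmax returns the index label of the row with the highest salary.
top = merged_df.loc[merged_df["Salary"].idxmax()]
print(top["Name"], top["Salary"])
```

With this data the highest salary is 7000, which belongs to Olivia.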

            That case was easy: both tables share a common column named EmployeeID.

            Let's try something more complicated.

            # Create a DataFrame of user information
            users_df = pd.DataFrame(
                {
                    "id": [1, 2, 3, 4, 5],
                    "name": ["John", "Emma", "Liam", "Olivia", "William"]
                }
            )
            # Wrap it in a SmartDataframe named "users"
            users = SmartDataframe(users_df, name="users")
            # Create a DataFrame of photo information
            photos_df = pd.DataFrame(
                {
                    "id": [31, 32, 33, 34, 35],
                    "user_id": [1, 1, 2, 4, 5]
                }
            )
            # Wrap it in a SmartDataframe named "photos"
            photos = SmartDataframe(photos_df, name="photos")
            # Create a SmartDatalake from "users" and "photos", passing the LLM in the config
            lake = SmartDatalake([users, photos], config={"llm": llm})
            # Call chat to ask how many photos John has uploaded
            lake.chat("How many photos has been uploaded by John?")
            

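            Here we gave each DataFrame a table name, which gives the LLM some context so that it can perform the join more reliably. As the generated code below shows, it worked out the correct join: the user "John" does indeed have 2 photos.

            # Print the last code executed by the lake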
            print(lake.last_code_executed)
            
            def analyze_data(dfs: list[pd.DataFrame]) ->dict:
                users = dfs[0]
                photos = dfs[1]
                merged_df = pd.merge(users, photos, left_on='id', right_on='user_id')
                john_photos = merged_df[merged_df['name'] == 'John']
                num_photos = john_photos.shape[0]
                return {'type': 'number', 'value': num_photos}
            result = analyze_data(dfs)
            
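One detail of this join is easy to miss: both tables have a column called id, so pandas renames them in the merged result. The sketch below repeats the merge in plain pandas to show the resulting column names. It is an illustration, not output from PandasAI.

```python
import pandas as pd

users = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["John", "Emma", "Liam", "Olivia", "William"],
})
photos = pd.DataFrame({
    "id": [31, 32, 33, 34, 35],
    "user_id": [1, 1, 2, 4, 5],
})

# Join users.id to photos.user_id (inner join by default).
merged_df = pd.merge(users, photos, left_on="id", right_on="user_id")

# The overlapping column name "id" gets the default suffixes _x (left) and _y (right).
# Passing e.g. suffixes=("_user", "_photo") would give more readable names.
print(merged_df.columns.tolist())

# Count the merged rows that belong to John.
num_photos = int((merged_df["name"] == "John").sum())
print(num_photos)
```

Liam has no photos, so the inner join drops him from the merged result entirely.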

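            Different LLMs

            Although OpenAI GPT-3.5 and GPT-4 are currently the recommended models, we also support other models such as Starcoder and Falcon.

            You can use them as follows:

            # Import the required classes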
            from pandasai import SmartDataframe
            from pandasai.llm import Starcoder, Falcon
            # Create a Starcoder object, passing the API token
            starcoder_llm = Starcoder(api_token="YOUR TOKEN")
            # Create a Falcon object, passing the API token
            falcon_llm = Falcon(api_token="YOUR TOKEN")
            # Create a SmartDataframe backed by the Starcoder LLM
            df1 = SmartDataframe(df, config={"llm": starcoder_llm})
            # Create a SmartDataframe backed by the Falcon LLM
            df2 = SmartDataframe(df, config={"llm": falcon_llm})
            # Print the answer returned by chat on df1
            print(df1.chat("Which country has the highest GDP?"))
            # Print the answer returned by chat on df2
            print(df2.chat("Which one is the unhappiest country?"))
            

            LangChain LLMs

            In some cases you may prefer to use LangChain LLMs.

            # Install pandasai with the langchain extra
            !pip install pandasai[langchain]
            
            Requirement already satisfied: pandasai[langchain] in /usr/local/lib/python3.10/dist-packages (1.1.1)
            Requirement already satisfied: astor=0.8.1 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (0.8.1)
            Requirement already satisfied: ipython=8.13.1 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (8.15.0)
            Requirement already satisfied: matplotlib=3.7.1 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (3.7.1)
            Requirement already satisfied: openai=0.27.5 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (0.27.10)
            Requirement already satisfied: pandas==1.5.3 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (1.5.3)
            Requirement already satisfied: pydantic=1 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (1.10.12)
            Requirement already satisfied: python-dotenv=1.0.0 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (1.0.0)
            Requirement already satisfied: scipy=1.9.0 in /usr/local/lib/python3.10/dist-packages (from pandasai[langchain]) (1.10.1)
            Collecting langchain=0.0.199 (from pandasai[langchain])
              Downloading langchain-0.0.199-py3-none-any.whl (1.0 MB)
            Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai[langchain]) (2.8.2)
            Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai[langchain]) (2023.3)
            Requirement already satisfied: numpy>=1.21.0 in /usr/local/lib/python3.10/dist-packages (from pandas==1.5.3->pandasai[langchain]) (1.23.5)
            Requirement already satisfied: backcall in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (0.2.0)
            Requirement already satisfied: decorator in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (4.4.2)
            Requirement already satisfied: jedi>=0.16 in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (0.19.0)
            Requirement already satisfied: matplotlib-inline in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (0.1.6)
            Requirement already satisfied: pickleshare in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (0.7.5)
            Requirement already satisfied: prompt-toolkit!=3.0.37,=3.0.30 in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (3.0.39)
            Requirement already satisfied: pygments>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (2.16.1)
            Requirement already satisfied: stack-data in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (0.6.2)
            Requirement already satisfied: traitlets>=5 in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (5.7.1)
            Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (1.1.3)
            Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.10/dist-packages (from ipython=8.13.1->pandasai[langchain]) (4.8.0)
            Requirement already satisfied: PyYAML>=5.4.1 in /usr/local/lib/python3.10/dist-packages (from langchain=0.0.199->pandasai[langchain]) (6.0.1)
            Requirement already satisfied: SQLAlchemy=1.4 in /usr/local/lib/python3.10/dist-packages (from langchain=0.0.199->pandasai[langchain]) (2.0.20)
            Requirement already satisfied: aiohttp=3.8.3 in /usr/local/lib/python3.10/dist-packages (from langchain=0.0.199->pandasai[langchain]) (3.8.5)
            Requirement already satisfied: async-timeout=4.0.0 in /usr/local/lib/python3.10/dist-packages (from langchain=0.0.199->pandasai[langchain]) (4.0.3)
            Collecting dataclasses-json=0.5.7 (from langchain=0.0.199->pandasai[langchain])
              Downloading dataclasses_json-0.5.14-py3-none-any.whl (26 kB)
            Collecting langchainplus-sdk>=0.0.9 (from langchain=0.0.199->pandasai[langchain])
              Downloading langchainplus_sdk-0.0.20-py3-none-any.whl (25 kB)
            Requirement already satisfied: numexpr=2.8.4 in /usr/local/lib/python3.10/dist-packages (from langchain=0.0.199->pandasai[langchain]) (2.8.5)
            Collecting openapi-schema-pydantic=1.2 (from langchain=0.0.199->pandasai[langchain])
              Downloading openapi_schema_pydantic-1.2.4-py3-none-any.whl (90 kB)
            Requirement already satisfied: requests=2 in /usr/local/lib/python3.10/dist-packages (from langchain=0.0.199->pandasai[langchain]) (2.31.0)
            Requirement already satisfied: tenacity=8.1.0 in /usr/local/lib/python3.10/dist-packages (from langchain=0.0.199->pandasai[langchain]) (8.2.3)
            Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai[langchain]) (1.1.0)
            Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai[langchain]) (0.11.0)
            Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai[langchain]) (4.42.1)
            Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai[langchain]) (1.4.4)
            Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai[langchain]) (23.1)
            Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai[langchain]) (9.4.0)
            Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib=3.7.1->pandasai[langchain]) (3.1.1)
            Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from openai=0.27.5->pandasai[langchain]) (4.66.1)
            Requirement already satisfied: typing-extensions>=4.2.0 in /usr/local/lib/python3.10/dist-packages (from pydantic=1->pandasai[langchain]) (4.7.1)
            Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp=3.8.3->langchain=0.0.199->pandasai[langchain]) (23.1.0)
            Requirement already satisfied: charset-normalizer=2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp=3.8.3->langchain=0.0.199->pandasai[langchain]) (3.2.0)
            Requirement already satisfied: multidict=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp=3.8.3->langchain=0.0.199->pandasai[langchain]) (6.0.4)
            Requirement already satisfied: yarl=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp=3.8.3->langchain=0.0.199->pandasai[langchain]) (1.9.2)
            Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp=3.8.3->langchain=0.0.199->pandasai[langchain]) (1.4.0)
            Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp=3.8.3->langchain=0.0.199->pandasai[langchain]) (1.3.1)
            Collecting marshmallow=3.18.0 (from dataclasses-json=0.5.7->langchain=0.0.199->pandasai[langchain])
              Downloading marshmallow-3.20.1-py3-none-any.whl (49 kB)
            Collecting typing-inspect=0.4.0 (from dataclasses-json=0.5.7->langchain=0.0.199->pandasai[langchain])
              Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB)
            Requirement already satisfied: parso=0.8.3 in /usr/local/lib/python3.10/dist-packages (from jedi>=0.16->ipython=8.13.1->pandasai[langchain]) (0.8.3)
            Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.10/dist-packages (from pexpect>4.3->ipython=8.13.1->pandasai[langchain]) (0.7.0)
            Requirement already satisfied: wcwidth in /usr/local/lib/python3.10/dist-packages (from prompt-toolkit!=3.0.37,=3.0.30->ipython=8.13.1->pandasai[langchain]) (0.2.6)
            Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas==1.5.3->pandasai[langchain]) (1.16.0)
            Requirement already satisfied: idna=2.5 in /usr/local/lib/python3.10/dist-packages (from requests=2->langchain=0.0.199->pandasai[langchain]) (3.4)
            Requirement already satisfied: urllib3=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests=2->langchain=0.0.199->pandasai[langchain]) (2.0.4)
            Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests=2->langchain=0.0.199->pandasai[langchain]) (2023.7.22)
            Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy=1.4->langchain=0.0.199->pandasai[langchain]) (2.0.2)
            Requirement already satisfied: executing>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from stack-data->ipython=8.13.1->pandasai[langchain]) (1.2.0)
            Requirement already satisfied: asttokens>=2.1.0 in /usr/local/lib/python3.10/dist-packages (from stack-data->ipython=8.13.1->pandasai[langchain]) (2.3.0)
            Requirement already satisfied: pure-eval in /usr/local/lib/python3.10/dist-packages (from stack-data->ipython=8.13.1->pandasai[langchain]) (0.2.2)
            Collecting mypy-extensions>=0.3.0 (from typing-inspect=0.4.0->dataclasses-json=0.5.7->langchain=0.0.199->pandasai[langchain])
              Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
            Installing collected packages: mypy-extensions, marshmallow, typing-inspect, openapi-schema-pydantic, langchainplus-sdk, dataclasses-json, langchain
            Successfully installed dataclasses-json-0.5.14 langchain-0.0.199 langchainplus-sdk-0.0.20 marshmallow-3.20.1 mypy-extensions-1.0.0 openapi-schema-pydantic-1.2.4 typing-inspect-0.9.0
            

            You can then use them as PandasAI LLMs.

            # Import the required classes
            from pandasai import SmartDataframe
            from langchain.llms import OpenAI
            # from langchain.llms import Anthropic
            # from langchain.llms import LlamaCpp
            # Create an OpenAI instance, passing your API key and the maximum number of tokens
            langchain_llm = OpenAI(openai_api_key="YOUR TOKEN", max_tokens=1000)
            # Create a SmartDataframe, passing the DataFrame and the config
            langchain_sdf = SmartDataframe(df, config={"llm": langchain_llm})
            # Call chat to ask the model a question
            langchain_sdf.chat("Which are the top 5 countries by GDP?")
            

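            Connectors

            PandasAI provides a number of connectors that let you connect to different data sources. They are designed to be easy to use even if you are not familiar with the data source or with PandasAI.

            To use a connector, first install the required dependencies by running the following command:

            # Install pandasai with the connectors extra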
            !pip install pandasai[connectors]
            
            Successfully installed Mako-1.2.4 PyHive-0.7.0 alembic-1.12.0 asn1crypto-1.5.1 databricks-sql-connector-2.9.3 lz4-4.3.2 oscrypto-1.3.0 platformdirs-3.8.1 pycryptodomex-3.19.0 pymysql-1.1.0 snowflake-connector-python-3.2.0 snowflake-sqlalchemy-1.5.0 sqlalchemy-databricks-0.2.0 thrift-0.16.0 tomlkit-0.12.1 urllib3-1.26.16
            
            # Import SmartDatalake and the MySQLConnector and PostgreSQLConnector classes
            from pandasai import SmartDatalake
            from pandasai.connectors import MySQLConnector, PostgreSQLConnector
            # Connect to a MySQL database
            loan_connector = MySQLConnector(
                config={
                    "host": "localhost", # host name
                    "port": 3306, # port number
                    "database": "mydb", # database name
                    "username": "root", # user name
                    "password": "root", # password
                    "table": "loans", # table name
                    "where": [
                        # Optional: filter the data to reduce the size of the data frame
                        ["loan_status", "=", "PAIDOFF"], # filter condition
                    ],
                }
            )
            # Connect to a PostgreSQL database
            payment_connector = PostgreSQLConnector(
                config={
                    "host": "localhost", # host name
                    "port": 5432, # port number
                    "database": "mydb", # database name
                    "username": "root", # user name
                    "password": "root", # password
                    "table": "payments", # table name
                    "where": [
                        # Optional: filter the data to reduce the size of the data frame
                        ["payment_status", "=", "PAIDOFF"], # filter condition
                    ],
                }
            )
            # Create a SmartDatalake from both connectors (llm is the LLM instance created earlier)
            df_connector = SmartDatalake([loan_connector, payment_connector], config={"llm": llm})
            # Ask a question with chat and print the answer
            response = df_connector.chat("How many loans are from the United States?")
            print(response)
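
To make the optional `where` entry more concrete, here is a minimal sketch of how a list of `[column, operator, value]` triples could be turned into a parameterized SQL filter. This is an illustration only, not PandasAI's actual implementation: `build_where_clause` is a hypothetical helper, the conditions are assumed to be combined with AND, and an in-memory SQLite table with made-up rows stands in for the `loans` table.

```python
import sqlite3


def build_where_clause(conditions):
    """Turn [column, operator, value] triples into a WHERE clause plus parameters.

    Hypothetical helper for illustration; conditions are joined with AND.
    """
    allowed_ops = {"=", "!=", "<", "<=", ">", ">="}
    parts, params = [], []
    for column, op, value in conditions:
        if op not in allowed_ops:
            raise ValueError(f"unsupported operator: {op}")
        if not column.isidentifier():
            raise ValueError(f"invalid column name: {column}")
        # Values are passed as bound parameters, never interpolated into the SQL.
        parts.append(f"{column} {op} ?")
        params.append(value)
    clause = " WHERE " + " AND ".join(parts) if parts else ""
    return clause, params


# Small in-memory table standing in for the "loans" table (rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (loan_id INTEGER, loan_status TEXT)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?)",
    [(1, "PAIDOFF"), (2, "COLLECTION"), (3, "PAIDOFF")],
)

where_sql, params = build_where_clause([["loan_status", "=", "PAIDOFF"]])
query = f"SELECT loan_id, loan_status FROM loans{where_sql} ORDER BY loan_id"
rows = conn.execute(query, params).fetchall()
print(query)
print(rows)
```

Only the rows that match the filter are fetched, which is why filtering at the source keeps the resulting data frame small.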
            
            # Import the YahooFinanceConnector class
            from pandasai.connectors.yahoo_finance import YahooFinanceConnector
            # Create a YahooFinanceConnector for the ticker symbol "MSFT"
            yahoo_connector = YahooFinanceConnector("MSFT")
            # Wrap the connector in a SmartDataframe, passing the config {"llm": llm}
            df = SmartDataframe(yahoo_connector, config={"llm": llm})
            # Ask for yesterday's closing price with chat
            response = df.chat("What is the closing price for yesterday?")
            # Print the result
            print(response)
            
            The closing price for yesterday was $319.53.
            
            # Create a YahooFinanceConnector for the ticker symbol "TSLA"
            yahoo_connector = YahooFinanceConnector("TSLA")
            # Wrap the connector in a SmartDataframe, passing the config {"llm": llm}
            df_connector = SmartDataframe(yahoo_connector, config={"llm": llm})
            # Ask for a chart of the Tesla stock over time and store the result in response
            response = df_connector.chat("Plot the chart of tesla over time")
            

            You can find more information about connectors, including additional connectors, here: https://docs.pandas-ai.com/en/latest/connectors/
