Releases: microsoft/msticpy
Compatibility release - Bokeh, VirusTotal, AzureCredentials
This is mainly a release to fix compatibility problems with some features deprecated in Bokeh 3.7.
It also includes fixes for a change in the behaviour of VTObject in the VirusTotalV3 code, which was breaking conversion to pandas DataFrames.
Finally, I've added some fixes for using AzureCliCredential and ManagedIdentityCredential. In cases where you
are using AzureCLI authentication with a ManagedIdentity (such as in AzureML compute), the credential
fails if you supply a TenantId when creating it. The code now checks that it can obtain a token and, if not, falls
back to creating the credential with no tenantId.
Similarly, the default for ManagedIdentityCredential is now to create it passing only client_id (or None if this is not defined).
It will fall back to the previous behavior if this fails.
If that also fails, it will fall back to creating the credential with no parameters.
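The fallback order described above can be sketched as a generic "try each option until one yields a token" loop. This is only an illustration of the pattern, not the MSTICPy implementation: `first_usable_credential` and `FakeCredential` are hypothetical names, and the fake class stands in for a real credential object (the azure-identity credential classes expose a `get_token` method that is probed in the same way).

```python
from typing import Any, Callable, Optional, Sequence


def first_usable_credential(
    factories: Sequence[Callable[[], Any]],
    scope: str,
) -> Optional[Any]:
    """Return the first credential that can actually obtain a token."""
    for factory in factories:
        try:
            credential = factory()
            credential.get_token(scope)  # probe: raises if this option does not work
        except Exception:  # broad on purpose: any failure means "try the next option"
            continue
        return credential
    return None


class FakeCredential:
    """Hypothetical stand-in for a real credential class (demo only)."""

    def __init__(self, tenant_id: Optional[str] = None) -> None:
        self.tenant_id = tenant_id

    def get_token(self, scope: str) -> str:
        # Simulate an environment that rejects an explicit tenant ID.
        if self.tenant_id is not None:
            raise RuntimeError("tenant_id not accepted in this environment")
        return f"token-for-{scope}"


cred = first_usable_credential(
    [
        lambda: FakeCredential(tenant_id="my-tenant"),  # first attempt: with tenant
        lambda: FakeCredential(),  # fallback: no tenant
    ],
    scope="https://management.azure.com/.default",
)
print(cred.tenant_id)
```

Ordering the factories from most to least specific keeps the preferred configuration when it works and only drops parameters when the token probe fails.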
What's Changed
- Compat fixes for Bokeh 3.7 by @ianhelle in #840
- Avoiding vulnerable dependencies by @ianhelle in #843
- updating cryptography to >=43.0.1
- Add explicit dependencies for jinja2>=3.1.5 and tornado>=6.4.2 to avoid vulnerable versions
Full Changelog: v2.16.1...v2.16.2.post
Maintenance release QueryEditor, PrismaCloudDriver
Highlights
This is largely a "fix-and-improve" release.
Some important fixes to:
- QueryEditor
- Pagination and retry capability added to Prisma Cloud Driver
- Dataclass issue
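As a rough illustration of what "pagination and retry" usually means for an API driver, here is a minimal sketch. It assumes a page-token style API and is not the PrismaCloudDriver code: `RateLimited`, `fetch_all_pages`, and the `items`/`next_token` keys are hypothetical names chosen for the example.

```python
import time
from typing import Any, Callable, Dict, List, Optional


class RateLimited(Exception):
    """Hypothetical error standing in for an HTTP 429 response."""


def fetch_all_pages(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Collect items from every page, retrying rate-limited requests."""
    items: List[Any] = []
    token: Optional[str] = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                page = fetch_page(token)
                break
            except RateLimited:
                if attempt == max_retries:
                    raise  # give up after the last allowed retry
                sleep(base_delay * 2**attempt)  # exponential backoff
        items.extend(page["items"])
        token = page.get("next_token")
        if not token:  # no further pages
            return items
```

Passing `sleep` as a parameter makes the backoff easy to test without real delays.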
As of this release we are switching from GitHub actions publishing to an Azure DevOps pipeline
(this is a Microsoft internal security requirement for enhancing supply chain security). It should not
affect your enjoyment of this package :-)
What's Changed
- Fix dataclass issue by @FlorianBracq in #833
- Add typing for Defender by @FlorianBracq in #828
- Edit process_cmd_line function template to accept list as parameter by @vx3r in #835
- Fix to QueryEditor by @ianhelle in #836
- Add pagination and retries based on load, support queries - PrismaCloudDriver by @raj-axe in #834
- Adding Azure Publishing pipeline by @ianhelle in #838
Full Changelog: v2.16.0...v2.16.1
Cyberint TI provider and Prisma Cloud (Palo Alto) Data provider
Prisma Cloud Driver
This pull request adds support for integrating Prisma Cloud into MSTICPy. By including a dedicated PrismaCloudDriver, the goal is to enable querying and analyzing data from Prisma Cloud’s APIs within MSTICPy’s data analysis framework.
The Prisma Cloud Driver, developed by Palo Alto Networks, integrates MSTICPy with Prisma Cloud’s security platform. It enables seamless authentication, querying, and data retrieval from Prisma Cloud’s assets, configurations, and events. By incorporating this driver, MSTICPy users gain streamlined access to cloud security data, allowing them to perform in-depth threat analysis, compliance checks, and security investigations directly within their existing data analysis workflows.
Big thanks to @raj-axe for this
Cyberint TI Provider
TI provider uses the Cyberint API for IoC lookup.
Azure Sentinel/Azure Monitor
We've had a bit of activity around Azure Sentinel/Azure Monitor.
@JPvRiel has been digging into this and found a few bugs. They also raised the issue that the current Azure Monitor driver
has no support for custom tables. I created an experimental driver in this release but it's not working as expected.
If anyone wants to take up the sword and tackle bugs #829, #830 and #831 I would appreciate your help.
#831 is specifically the problems with the experimental driver.
The other two are bugs in the existing Azure Monitor/Sentinel provider (although I'm not sure that support for parsing time ranges is an easy fix, since we're relying on the azure.monitor.query SDK to do this conversion).
Thanks to @vx3r for this.
Certificate Authentication support for OData drivers (Defender and MSGraph)
Thanks to @FlorianBracq for this.
Other changes
Lots more typing work by our esteemed @FlorianBracq
Various fixes but some important ones:
- Maxmind API change
- Bokeh (should now support current Bokeh versions)
- Panel (workaround for seeming bug in 1.16.1)
What's Changed
- Fix typing issue for FoliumMap by @FlorianBracq in #814
- Add Azure kusto driver typing by @FlorianBracq in #816
- Odata certificate support by @FlorianBracq in #812
- Fix change to maxmind API 2.6.3 by @ianhelle in #823
- Apply typing to the Cybereason driver by @FlorianBracq in #813
- add Cyberint TI provider by @vx3r in #817
- Ianhelle/update to v2.16.0 by @ianhelle in #824
- Ianhelle/az monitor search driver 2025 02 05 by @ianhelle in #825
- Fixed autogen package by @ekzhu in #818
- prisma_cloud driver by @raj-axe in #821
- Updating bokeh code to support 3.4.0+ by @ianhelle in #826
- Cyberint risk key none value by @vx3r in #832
New Contributors
Full Changelog: v2.15.0...v2.16.0
Multi-dimensional plots for outliers
Highlights
Multi-dimensional plots for outliers by @Tatsuya-hasegawa
The outliers module has lived in MSTICPy for a long time but has been somewhat neglected.
@Tatsuya-hasegawa (hacker-T) has contributed some cool visualizations to
better interpret the data.
Many thanks!!!
import numpy as np
from msticpy.analysis.outliers import identify_outliers, plot_outlier_results

n_dimension = 7
# create random numeric samples
data = np.random.rand(100, n_dimension)
# calculate outliers using the Isolation Forest algorithm
clf, X_outliers, y_pred_outliers = identify_outliers(data, data, contamination=0.1, max_features=0.4)
feature_columns = [f'feature{i}' for i in range(1, n_dimension+1)]
plot_outlier_results(
    clf,
    data,
    data,
    X_outliers,
    feature_columns=feature_columns,
    plt_title="MSTICPY Isolation Forest Anomaly Detection for Multi Dimension Features"
)
Improved code/docs for federated authentication for M365D/M365 Graph providers - @ryan-detect-dot-dev
Although using federated auth (rather than client secret) has been possible for a while, the documentation
for how to use this was missing from the MSTICPy docs. Thanks to Ryan we now have this (along with cleaned up code
for the Defender* data providers).
(although Ryan is listed as a new contributor below - he has made several previous contributions under
a different GitHub identity)
Rigorous Type Annotation work started by @FlorianBracq earlier this year continues.
This helps to make the code more robust and clearer to read and use. This is thankless work but my
huge thanks go out to @FlorianBracq for this!
Other fixes
Some other important fixes to the Cybereason driver and the Azure Monitor/MS Sentinel driver are also included.
What's Changed
- Cybereason driver fix http429 tests and exception by @vx3r in #803
- Cybereason driver query return instance name in dataframe by @vx3r in #804
- Add multi dimension plots to analysis.outliers module. by @Tatsuya-hasegawa in #805
- Avoid httpx 0.28.0 for unit tests by @ianhelle in #811
- Add typing hints to core classes by @FlorianBracq in #810
- Fixing azure_monitor_driver for deprecated httpx API by @ianhelle in #809
- Update version to 2.15.0 by @ianhelle in #806
- Update MDATP Driver for delegated auth by @ryan-detect-dot-dev in #784
New Contributors
- @ryan-detect-dot-dev made their first contribution in #784
Full Changelog: v2.14.0...v2.15.0
User Session Management, MaxMind GeoLite fix, Extract nested dicts from Pandas
User Session Configuration
Do you always have one or more data providers or other components that you need to load for every notebook you create?
I do, and got a bit fed up with typing the same lines of code over and over again.
User session configuration lets you specify which providers are loaded, whether or not to connect and which parameters
to supply at load and connect time. You put all of this into a straightforward YAML file and load it using the following:
import msticpy as mp # you likely will already be doing this
mp.init_notebook() # and this
mp.load_user_session("my_config.yaml") # if you have a "mp_user_session.yaml" in the current directory
# you can skip the parameter
This example shows the structure of the YAML:
QueryProviders:
  qry_prov_sent:
    DataEnvironment: MSSentinel
    InitArgs:
      debug: True
    Connect: True
    ConnectArgs:
      workspace: MySoc
      auth_methods: ['cli', 'device_code']
  qry_prov_md:
    DataEnvironment: M365D
Components:
  mssentinel:
    Module: msticpy.context.azure
    Class: MicrosoftSentinel
    InitArgs:
    Connect: True
    ConnectArgs:
      workspace: MySoc
      auth_methods: ['cli', 'device_code']
The providers/components created (e.g. qry_prov_sent in this example)
are published back to your notebook Python namespace, so you'll see
these available as variables ready to use.
This configuration file is equivalent to the following code:
qry_prov_sent = mp.QueryProvider("MSSentinel", debug=True)
qry_prov_sent.connect(workspace="MySoc", auth_methods=['cli', 'device_code'])
qry_prov_md = mp.QueryProvider("M365D")
from msticpy.context.azure import MicrosoftSentinel
mssentinel = MicrosoftSentinel()
mssentinel.connect(workspace="MySoc", auth_methods=['cli', 'device_code'])
Not a huge saving, on the face of it, but if you create a lot of notebooks or want to use
msticpy in an automation scenario, it can be very helpful.
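To make the mapping from configuration to code more concrete, here is a minimal sketch of how a loader could interpret a Components section like the one above once it has been parsed into a dictionary. It is an assumption-laden illustration, not the MSTICPy loader: `load_components` is a hypothetical function, and the demo uses a standard-library class so that it runs without any providers installed.

```python
import importlib
from typing import Any, Dict


def load_components(config: Dict[str, Any], namespace: Dict[str, Any]) -> None:
    """Create each configured component and publish it into `namespace`."""
    for name, settings in config.get("Components", {}).items():
        module = importlib.import_module(settings["Module"])
        cls = getattr(module, settings["Class"])
        # An empty "InitArgs:" key parses as None, so fall back to no arguments.
        instance = cls(**(settings.get("InitArgs") or {}))
        if settings.get("Connect"):
            instance.connect(**(settings.get("ConnectArgs") or {}))
        namespace[name] = instance  # the variable name is the config key


# Demo with a standard-library class instead of a real MSTICPy component.
demo_config = {
    "Components": {
        "ratio": {
            "Module": "fractions",
            "Class": "Fraction",
            "InitArgs": {"numerator": 3, "denominator": 4},
            "Connect": False,
        }
    }
}
namespace: Dict[str, Any] = {}
load_components(demo_config, namespace)
print(namespace["ratio"])
```

In a notebook, the namespace dictionary would be the notebook's global variables, which is how the configured names become ready-to-use variables.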
Include a verbose=True parameter to load_user_session to see more detailed logging of what is going on.
See the full documentation here
Maxmind GeoIPLite fix
Sometime recently (not too sure when) Maxmind changed their download procedure to use
a different URL and authentication mechanism. This was causing auto-update to fail. To use
the new mechanism you need to get your Maxmind User Account ID (log in and look at your
account properties) and add that to your msticpyconfig.yaml as shown below.
OtherProviders:
  GeoIPLite:
    Args:
      AccountID: "1234567"
      AuthKey:
        EnvironmentVar: "MAXMIND_AUTH"
      DBFolder: "~/.msticpy"
    Provider: "GeoLiteLookup"
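In the configuration above, AccountID is given as literal text while AuthKey points at an environment variable. A minimal sketch of how such a setting could be resolved is shown below. `resolve_setting` is a hypothetical helper, not an MSTICPy function, and it only covers the literal and EnvironmentVar cases.

```python
import os
from typing import Any, Mapping, Optional


def resolve_setting(value: Any, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a literal value, or look it up if it names an environment variable."""
    if isinstance(value, Mapping) and "EnvironmentVar" in value:
        # Indirect value: read it from the named environment variable.
        return env.get(value["EnvironmentVar"])
    return None if value is None else str(value)


args = {
    "AccountID": "1234567",
    "AuthKey": {"EnvironmentVar": "MAXMIND_AUTH"},
}
# A plain dict stands in for the real environment in this demo.
resolved = {
    key: resolve_setting(val, env={"MAXMIND_AUTH": "example-key"})
    for key, val in args.items()
}
print(resolved)
```

Keeping the key in an environment variable means the secret never has to be written into msticpyconfig.yaml itself.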
Extract nested dictionaries from pandas column to multiple rows/columns
@pioneerHitesh has added this as a new method in the mp_pivot pandas extension:
data_df.mp_pivot.dict_to_dataframe(col="my_nested_column")
It returns a dataframe with the column recursively expanded:
- lists become new rows
- dictionaries become new columns
So a column with the following structure:
| | NCol |
|---|---|
| 0 | {'A': ['A1', 'A2', 'A3'], 'B': {'B1': 'B1-1', 'B2': 'B2-1'}} |
| 1 | {'A': ['A3', 'A4', 'A5'], 'B': {'B3': 'B3-1', 'B4': 'B4-1'}} |
my_df = src_df.mp_pivot.dict_to_dataframe(col="NCol")
my_df
Would be unpacked to:
| | A.0 | A.1 | A.2 | B.B1 | B.B2 | B.B3 | B.B4 |
|---|---|---|---|---|---|---|---|
| 0 | A1 | A2 | A3 | B1-1 | B2-1 | nan | nan |
| 1 | A3 | A4 | A5 | nan | nan | B3-1 | B4-1 |
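The dotted column names in the output (A.0, B.B1 and so on) come from walking the nested structure recursively. The pure-Python sketch below reproduces that naming for a single row. It is an illustration of the idea only, not the dict_to_dataframe implementation, and `flatten` is a hypothetical helper.

```python
from typing import Any, Dict


def flatten(value: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts/lists into a single dict with dotted keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)  # list positions become key parts
    else:
        return {prefix: value}  # leaf value: stop recursing
    flat: Dict[str, Any] = {}
    for key, child in items:
        child_prefix = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, child_prefix))
    return flat


row = {"A": ["A1", "A2", "A3"], "B": {"B1": "B1-1", "B2": "B2-1"}}
flat_row = flatten(row)
print(flat_row)
```

Rows that lack a given key (such as B.B3 for the first row) simply have no entry, which is why they appear as nan once the rows are combined into a DataFrame.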
What's Changed
- Authentication module unit test by @ianhelle in #800
- Use sessions config and GeoIP download failure by @ianhelle in #801
- Added Inbuilt function to extract nested JSON by @pioneerHitesh in #798
- Add max retry parameter to the execution prevent HTTP 429 by @vx3r in #802
New Contributors
- @pioneerHitesh made their first contribution in #798
- @vx3r made their first contribution in #802
Full Changelog: v2.13.1...v2.14.0
Hotfix for authentication error
We introduced a bug in azure_auth_core that caused Azure authentication to fail.
What's Changed
- Provider and lookup typing by @FlorianBracq in #795
- Fix for bug in azure_core_auth that fails authentication by @ianhelle in #799
Full Changelog: v2.13.0...v2.13.1
AI documentation assistant, BinaryEdge TI provider and other misc fixes
We've been quietly doing some work to introduce LLM/GPT/AI capabilities into msticpy.
@EileenG02 has helped us in that direction by building a document Q&A agent using Autogen.
You can try it out in a notebook using the following:
Load the magic extension
%load_ext msticpy.aiagents.mp_docs_rag_magic
Ask a question in a separate cell using the %%ask cell magic
%%ask
What are the three things that I need to connect to Azure Query Provider?
Awesome work @EileenG02!
There's also a new TI provider for BinaryEdge courtesy of @petebryan.
Alongside this there have been quite a few contributions to fix and improve things like:
- Splunk improvements (thanks @Tatsuya-hasegawa)
- Fixes for Sentinel provider get_alert_rules to use updated API (thanks @BWC-TomW)
- A massive amount of type annotation work and fixes to context/TI providers by @FlorianBracq
- Miscellaneous fixes to things like the Sentinel TI provider, an MSSentinel tidy-up to handle parameters more consistently,
and replacing the term CountryName with CountryOrRegionName in geolocation contexts.
The gory details of the PRs follow:
What's Changed
- Add extra tests and fixes to QueryProvider, DriverBase and (as)sync query handling by @FlorianBracq in #777
- Fix incorrect ref to ip_utils module in docs by @ianhelle in #779
- Fix some deprecation warnings by @FlorianBracq in #781
- Fixing np.NaN error and build warnings by @ianhelle in #785
- Removing data matching AV signatures by @ianhelle in #786
- Create codeql_updated.yml by @ianhelle in #787
- Update black requirement from <24.0.0,>=20.8b1 to >=20.8b1,<25.0.0 by @dependabot in #742
- Update docutils requirement from <0.20.0 to <0.22.0 by @dependabot in #768
- Add upload data styles to Splunk uploader by @Tatsuya-hasegawa in #776
- Added BinaryEdge provider by @petebryan in #780
- Update sentinel_analytics.py to update get_alert_rules to use new API version by @BWC-TomW in #789
- Fixing MSSentinel to obey parameters by @ianhelle in #791
- Add Autogen and RAG Agent to MSTICpy by @EileenG02 in #793
- Update TILookup and ContextLookup by @FlorianBracq in #794
- Fix sentinel TI provider by @ianhelle in #797
- Updating CountryName to CountryOrRegionName by @ianhelle in #796
New Contributors
- @BWC-TomW made their first contribution in #789
- @EileenG02 made their first contribution in #793
Full Changelog: v2.12.0...v2.13.0
Splunk and Sentinel Updates
Sentinel updates
WorkspaceConfig and Sentinel QueryProvider (azure_monitor_driver) have had a few updates:
- handle both old (Kqlmagic) and standard connection string formats in WorkspaceConfig
- removing a lot of legacy code from WorkspaceConfig
- Allow additional connection parameters to be used with MSSentinel QueryProvider for authentication parameters (e.g. you can now supply authentication parameters like "client_id", "client_secret" to query_provider.connect())
- msticpyconfig.yaml now supports using an "MSSentinel" key in place of "AzureSentinel"
- Workspace entries in msticpyconfig.yaml support an Args subkey, where you can add authentication parameters - these will be supplied to the connect() method if not overridden on the command line. Like Args sections for other providers, the values here can be text or references to environment variables or Azure Key Vault secrets.
- Fix to MSSentinel API update_incident to add full properties
Splunk Updates
- Added jwt authentication token expiry check.
Other fixes
- Fix for vtlookup3.py - Fixed problematic way of using nest_asyncio - this was causing failures when run from a langchain agent.
- Fix for lookup/tilookup - If the progress parameter was not passed it would still try to cancel a non-existent progress task and cause an exception.
- QueryProviders - Fix split query time-ranges calculation - thanks to @pjain90 for spotting this.
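For context on the split-query fix, splitting a query means dividing its time range into consecutive chunks that cover the range exactly once. The sketch below shows one way to compute such chunks. It is a generic illustration with a hypothetical `split_time_range` function, not the MSTICPy calculation.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def split_time_range(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (chunk_start, chunk_end) pairs covering start..end."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    chunk_start = start
    while chunk_start < end:
        # Clamp the final chunk so it never runs past the requested end time.
        chunk_end = min(chunk_start + step, end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end


start = datetime(2024, 1, 1)
end = start + timedelta(hours=5)
chunks = list(split_time_range(start, end, timedelta(hours=2)))
for chunk in chunks:
    print(chunk)
```

Each chunk starts where the previous one ended, so no part of the range is skipped or queried twice.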
What's Changed
- Set up CI with 1ES Azure Pipelines by @ianhelle in #763
- Update ws_config to handle kqlmagic connection strings by @ianhelle in #767
- Fix split query time-ranges calculation by @ianhelle in #762
- Add support for ruff and u/p devcontainer by @ianhelle in #765
- Add jwt auth token expire check and modify some messages when connecting Splunk by @Tatsuya-hasegawa in #770
- WSConfig updates by @ianhelle in #771
- Pass true for props into _build_sent_data when calling update_incident by @kylelol in #774
- Changing cert thumbprint from Sha1 to Sha256 in Az Kusto driver by @ianhelle in #775
New Contributors
Full Changelog: v2.11.0...v2.12.0
Sentinel Split Query fix
This is a minor release mainly to add a warning for Kusto/Sentinel queries that return partial results.
A close friend of MSTICPy (thx @Cyb3r-Monk) spotted that MSTICPy does not report partial results when doing split queries, so it's possible to silently lose data from the query range.
Due to an unfortunate admin error, the fix for this was committed directly to main, so no PR for this is available. :-(
If you want the query to fail (throw an exception) rather than just warn, you can supply a new parameter, fail_if_partial.
This only affects the Sentinel query provider and works for standard as well as split queries.
NOTE: the documentation has a typo and calls this fail_on_commit - we'll fix that in the next release to support both fail_if_partial and fail_on_partial.
Example
qry_prov.exec_query(query_string, fail_if_partial=True)
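The warn-by-default, fail-on-request behaviour can be pictured with the small sketch below. It is a simplified illustration and not the driver code: `handle_query_result`, `PartialResultsError`, and the `is_partial` flag are hypothetical names, with only fail_if_partial taken from the release notes.

```python
import warnings
from typing import Any, List


class PartialResultsError(Exception):
    """Hypothetical exception for a query that returned incomplete data."""


def handle_query_result(
    rows: List[Any], is_partial: bool, fail_if_partial: bool = False
) -> List[Any]:
    """Warn about partial results by default, or raise if asked to."""
    if is_partial:
        message = "Query returned partial results; some data may be missing."
        if fail_if_partial:
            raise PartialResultsError(message)
        warnings.warn(message, RuntimeWarning)
    return rows
```

Raising is the safer choice in automation, where a warning in a log is easy to miss.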
What's Changed
- Missing PR for partial query warning and fixes for pandas deprecation warnings. See the diff for changes.
- Fixing group.apply for pandas < 2.2.1 by @ianhelle in #759
- Added missing quotation in code block by @ryan-aus in #753
- Bump httpx from 0.25.2 to 0.27.0 by @dependabot in #754
- Bump readthedocs-sphinx-ext from 2.2.3 to 2.2.5 by @dependabot in #743
- Updated conda reqs files for new packages by @ianhelle in #758
- Build break fix for splunk SDK by @ianhelle in #760
New Contributors
Full Changelog: v2.10.0...v2.11.0
v2.10.0
What's Changed
- Add nest_asyncio to run threaded queries by @FlorianBracq in #737
- Bump sphinx-rtd-theme from 1.3.0 to 2.0.0 by @dependabot in #738
- Bump httpx from 0.25.0 to 0.25.2 by @dependabot in #736
- Adding Virus Total Search Capabilities by @secops-account in #739
- Add security token auth and credential loading from msticpyconfig.yaml to SplunkUploader by @Tatsuya-hasegawa in #731
- fix: updated _get_query_status in the azure monitor driver by @aka0 in #745
- Added M365DGraph to the supported environments for existing queries by @d3vzer0 in #748
- Small Typo correction in SentinelWatchlists.rst by @Korving-F in #746
- Fix ibm_xforce TI provider for domain names and URLs by @pcoccoli in #749
- Update python-package.yml by @ianhelle in #750
- Ianhelle/aml updates 2024 01 31 by @ianhelle in #751
- Ianhelle/warning fixes 2024 02 11 by @ianhelle in #752
New Contributors
- @secops-account made their first contribution in #739
- @aka0 made their first contribution in #745
- @Korving-F made their first contribution in #746
- @pcoccoli made their first contribution in #749
Full Changelog: v2.9.0...v2.10.0