| Dependency | Reason |
|---|---|
| Dagrun Running | Task instance's dagrun was not in the 'running' state but in the state 'failed'. |
| Task Instance State | Task is in the 'failed' state which is not a valid state for execution. The task must be cleared in order to be run. |

    import io
    import json

    import pandas as pd
    from azure.storage.blob import BlobClient

    # BLOB_CONNECT_STRING is assumed to be defined elsewhere in the DAG module.


    def generar_archivo_perf_procesado_collahuasi(origin_perf_file_container, origin_perf_file_blob,
                                                  final_container, final_blob):
        """
        Generate a final file with the processed Collahuasi drilling data.

        NOTE: unfinished function, created only so that a processed Collahuasi
        file is available. DrillTime still needs to be added before duplicates
        are removed.

        Parameters
        ----------
        origin_perf_file_container : str
            Name of the container that holds the drilling file.
        origin_perf_file_blob : str
            Name of the drilling file.
        final_container : str
            Name of the container where the final file will be stored.
        final_blob : str
            Name of the final file.

        Returns
        -------
        None
        """
        # --- Read the source file ---
        # Flanders HHD file
        conn_origin_container = BlobClient.from_connection_string(
            conn_str=BLOB_CONNECT_STRING,
            container_name=origin_perf_file_container,
            blob_name=origin_perf_file_blob,
        )
        # Download once and reuse the bytes for both parsing attempts.
        raw = conn_origin_container.download_blob().readall()
        try:
            # First layout: {"headers": [...], "values": [[...], ...]}
            data_perf = json.loads(raw)
            df_perf = pd.DataFrame(data_perf['values'], columns=data_perf['headers'])
        except (KeyError, TypeError, ValueError):
            # Fallback layout: JSON written with pandas' "table" orientation
            df_perf = pd.read_json(io.BytesIO(raw), orient='table')

        # --- Process the data ---
        # df_Flanders = calculate_drill_time(df_hhd_aux, df_mwd, 'DateTime', ['Drill_Number','OperatorID','HoleNumber'])
        # Drop records that are fully duplicated in the source.
        df_perf_final = df_perf.drop_duplicates(keep='first').copy()

        # Added field: bench, taken from the text between the first two hyphens of 'Malla'.
        df_perf_final['banco'] = (
            df_perf_final['Malla'].astype(str).str.extract(r'-(.*?)-', expand=False).astype(float)
        )

        # Coerce types; values that cannot be parsed become NaT / NaN.
        for col in ['horaDeMuestra', 'fechaInicioPozo', 'fechaFinPozo']:
            df_perf_final[col] = pd.to_datetime(df_perf_final[col], errors='coerce')
        for col in ['Profundidad', 'PosicionXPlanificada', 'PosicionYPlanificada']:
            df_perf_final[col] = pd.to_numeric(df_perf_final[col], errors='coerce')

        # Drill time in minutes, from the hole start and end timestamps.
        df_perf_final['DrillTime'] = (
            df_perf_final['fechaFinPozo'] - df_perf_final['fechaInicioPozo']
        ).dt.total_seconds() / 60
        # df_perf_final['DrillTime'] = (df_perf_final.groupby(subset_clave)['horaDeMuestra'].transform('max') -
        #     df_perf_final.groupby(subset_clave)['horaDeMuestra'].transform('min')).dt.total_seconds()/60

        # Drop duplicated holes by key, keeping the newest and then deepest sample.
        subset_clave = ['banco', 'PosicionXPlanificada', 'PosicionYPlanificada']
        df_perf_final = df_perf_final.sort_values(
            by=subset_clave + ['horaDeMuestra', 'Profundidad'],
            ascending=[True, True, True, False, False],
        )
        df_perf_final = df_perf_final.drop_duplicates(subset=subset_clave, keep='first')

        # --- Upload the result ---
        blob = BlobClient.from_connection_string(
            conn_str=BLOB_CONNECT_STRING,
            container_name=final_container,
            blob_name=final_blob,
        )
        json_data = df_perf_final.to_json(index=False, orient='table')
        blob.upload_blob(json_data, overwrite=True)

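
The two key steps of the function, bench extraction from `Malla` and per-hole de-duplication, can be illustrated without pandas or Azure. The sketch below is a plain-Python approximation, not the function itself: `extract_banco`, `latest_per_hole`, and the sample rows are hypothetical, and it assumes there are no missing values in the key or ordering columns.

```python
import re
from datetime import datetime


def extract_banco(malla):
    """Return the text between the first two hyphens as a float, or None."""
    match = re.search(r"-(.*?)-", str(malla))
    return float(match.group(1)) if match else None


def latest_per_hole(rows):
    """Keep one row per (banco, X, Y): newest sample first, then greatest depth."""
    best = {}
    for row in rows:
        key = (
            extract_banco(row["Malla"]),
            row["PosicionXPlanificada"],
            row["PosicionYPlanificada"],
        )
        rank = (row["horaDeMuestra"], row["Profundidad"])
        # Replace the stored row only when this one ranks strictly higher.
        if key not in best or rank > best[key][0]:
            best[key] = (rank, row)
    return [row for _, row in best.values()]


# Hypothetical sample rows; the 'Malla' format is an assumption.
rows = [
    {"Malla": "F3-4200-012", "PosicionXPlanificada": 100.0, "PosicionYPlanificada": 200.0,
     "horaDeMuestra": datetime(2025, 12, 25, 8, 0), "Profundidad": 10.0},
    {"Malla": "F3-4200-012", "PosicionXPlanificada": 100.0, "PosicionYPlanificada": 200.0,
     "horaDeMuestra": datetime(2025, 12, 25, 9, 0), "Profundidad": 15.5},
    {"Malla": "F3-4215-007", "PosicionXPlanificada": 100.0, "PosicionYPlanificada": 200.0,
     "horaDeMuestra": datetime(2025, 12, 25, 7, 0), "Profundidad": 12.0},
]

result = latest_per_hole(rows)
for row in result:
    print(row["Malla"], row["horaDeMuestra"], row["Profundidad"])
```

One thing the sketch makes visible: a `Malla` value whose middle segment is not numeric would make the `float` conversion raise, and the same applies to `.astype(float)` in the pandas version.
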
| Attribute | Value |
|---|---|
| dag_id | Collahuasi_BrightBoard |
| duration | 0.158249 |
| end_date | 2025-12-26 15:01:33.190853+00:00 |
| execution_date | 2025-12-25T15:00:00+00:00 |
| executor_config | {} |
| generate_command | <function TaskInstance.generate_command at 0x7783d3ac3040> |
| hostname | 447b87b210b3 |
| is_premature | False |
| job_id | 10864 |
| key | ('Collahuasi_BrightBoard', 'BrighBoard_GenerarArchivoPerf', <Pendulum [2025-12-25T15:00:00+00:00]>, 2) |
| log | <Logger airflow.task (INFO)> |
| log_filepath | /usr/local/airflow/logs/Collahuasi_BrightBoard/BrighBoard_GenerarArchivoPerf/2025-12-25T15:00:00+00:00.log |
| log_url | http://localhost:8080/admin/airflow/log?execution_date=2025-12-25T15%3A00%3A00%2B00%3A00&task_id=BrighBoard_GenerarArchivoPerf&dag_id=Collahuasi_BrightBoard |
| logger | <Logger airflow.task (INFO)> |
| mark_success_url | http://localhost:8080/success?task_id=BrighBoard_GenerarArchivoPerf&dag_id=Collahuasi_BrightBoard&execution_date=2025-12-25T15%3A00%3A00%2B00%3A00&upstream=false&downstream=false |
| max_tries | 0 |
| metadata | MetaData(bind=None) |
| next_try_number | 2 |
| operator | PythonOperator |
| pid | 2310917 |
| pool | default_pool |
| prev_attempted_tries | 1 |
| previous_execution_date_success | None |
| previous_start_date_success | None |
| previous_ti | <TaskInstance: Collahuasi_BrightBoard.BrighBoard_GenerarArchivoPerf 2025-12-24 15:00:00+00:00 [failed]> |
| previous_ti_success | None |
| priority_weight | 1 |
| queue | default |
| queued_dttm | 2025-12-26 15:01:30.618272+00:00 |
| raw | False |
| run_as_user | None |
| start_date | 2025-12-26 15:01:33.032604+00:00 |
| state | failed |
| task | <Task(PythonOperator): BrighBoard_GenerarArchivoPerf> |
| task_id | BrighBoard_GenerarArchivoPerf |
| test_mode | False |
| try_number | 2 |
| unixname | airflow |
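
The timestamps in the table above are enough to cross-check the reported `duration` and to see how long the attempt waited in the queue. A minimal sketch using only the standard library, with the values copied from the table:

```python
from datetime import datetime

# Values copied from the task instance attributes above.
queued_dttm = datetime.fromisoformat("2025-12-26 15:01:30.618272+00:00")
start_date = datetime.fromisoformat("2025-12-26 15:01:33.032604+00:00")
end_date = datetime.fromisoformat("2025-12-26 15:01:33.190853+00:00")

# Time spent executing and time spent waiting before execution, in seconds.
run_seconds = (end_date - start_date).total_seconds()
queue_seconds = (start_date - queued_dttm).total_seconds()

print(f"ran for {run_seconds:.6f} s after {queue_seconds:.6f} s in the queue")
```

The computed run time agrees with the `duration` of 0.158249 shown in the table. An attempt that ends after roughly 0.16 s probably failed before doing any real work, so the file at `log_filepath` is the natural next place to look.
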

| Attribute | Value |
|---|---|
| dag | <DAG: Collahuasi_BrightBoard> |
| dag_id | Collahuasi_BrightBoard |
| depends_on_past | False |
| deps | {<TIDep(Not In Retry Period)>, <TIDep(Trigger Rule)>, <TIDep(Previous Dagrun State)>} |
| do_xcom_push | True |
| downstream_list | [] |
| downstream_task_ids | set() |
| email | None |
| email_on_failure | True |
| email_on_retry | True |
| end_date | None |
| execution_timeout | None |
| executor_config | {} |
| extra_links | [] |
| global_operator_extra_link_dict | {} |
| inlets | [] |
| lineage_data | None |
| log | <Logger airflow.task.operators (INFO)> |
| logger | <Logger airflow.task.operators (INFO)> |
| max_retry_delay | None |
| on_failure_callback | None |
| on_retry_callback | None |
| on_success_callback | None |
| op_args | [] |
| op_kwargs | {'origin_perf_file_container': 'raw', 'origin_perf_file_blob': 'Collahuasi/2026-01-02/data_drill.json', 'final_container': 'processed', 'final_blob': 'Collahuasi/BrightBoard/Drills/2026-01-02/data_drill.json'} |
| operator_extra_link_dict | {} |
| operator_extra_links | () |
| outlets | [] |
| owner | Carlos |
| params | {} |
| pool | default_pool |
| priority_weight | 1 |
| priority_weight_total | 1 |
| provide_context | False |
| queue | default |
| resources | None |
| retries | 0 |
| retry_delay | 0:05:00 |
| retry_exponential_backoff | False |
| run_as_user | None |
| schedule_interval | 0 15 * * * |
| shallow_copy_attrs | ('python_callable', 'op_kwargs') |
| sla | None |
| start_date | 2023-11-06T00:00:00+00:00 |
| subdag | None |
| task_concurrency | None |
| task_id | BrighBoard_GenerarArchivoPerf |
| task_type | PythonOperator |
| template_ext | [] |
| template_fields | ('templates_dict', 'op_args', 'op_kwargs') |
| templates_dict | None |
| trigger_rule | all_success |
| ui_color | #ffefeb |
| ui_fgcolor | #000 |
| upstream_list | [] |
| upstream_task_ids | set() |
| wait_for_downstream | False |
| weight_rule | downstream |
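
The task instance ran on 2025-12-26 even though its `execution_date` is 2025-12-25. That is consistent with Airflow's scheduling convention, in which the run for an interval is triggered once that interval has ended. A minimal sketch of the arithmetic, assuming the daily cron `0 15 * * *` is evaluated in UTC (the timestamps in the tables carry a `+00:00` offset):

```python
from datetime import datetime, timedelta, timezone

# Values copied from the tables above.
execution_date = datetime(2025, 12, 25, 15, 0, tzinfo=timezone.utc)
queued_dttm = datetime(2025, 12, 26, 15, 1, 30, 618272, tzinfo=timezone.utc)

# "0 15 * * *" fires once per day at 15:00, so one interval is one day.
interval = timedelta(days=1)

# The run covering [execution_date, execution_date + interval) can only
# be triggered once the interval is over.
earliest_trigger = execution_date + interval
scheduling_delay = queued_dttm - earliest_trigger

print("earliest trigger:", earliest_trigger.isoformat())
print("queued after:", scheduling_delay)
```

Under that assumption the task was queued about a minute and a half after the earliest possible trigger time, which matches normal scheduler latency rather than a delayed or backfilled run.
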